Manipulating
Macromolecules



The greatest advances in molecular cell biology
in the recent past have been based on
the analysis and manipulation of macromolecules,
particularly DNA. For years it
was clear that many deep biological secrets
were locked up in the sequence of bases in DNA, but 
obtaining the sequences of long regions of DNA — not to 
mention altering these sequences at will — seemed a dis- 
tant dream. An avalanche of technical advances in the 
1970s drastically changed this perspective. First, enzymes 
were discovered that cut the DNA from any organism at 
specific short nucleotide sequences, generating a repro- 
ducible set of pieces. The availability of these enzymes, 
called restriction endonucleases, greatly facilitated two 
important developments: DNA cloning and DNA se- 
quencing. 

Two DNA molecules can be joined enzymatically and 
thus restriction fragments of any DNA can be inserted 
into a variety of vectors, often plasmid DNA, to produce 
recombinant DNA. The recombinant molecules can be 
introduced into an appropriate cell population (most 
often bacteria) and cells containing recombinant DNA 
Radioisotopes: The
Indispensable Modern Means
of Following Biological
Activity
Table 6-1 Commonly used radioisotopes

                                    Energy of     Mean path                    Common specific
                                    emitted       length in     Specific       activities for
                                    particle      water         activity       compounds
Radioisotope           Half-life    (MeV)*        (µm)          (mCi/mA)†      (mCi/mmol)‡

Tritium (hydrogen 3)   12.35 yr     0.0186        0.47          2.92 × 10⁴     10²-10⁵
Carbon 14              5730 yr      0.156         42            62.4           1-10²
Sulfur 35              87.5 days    0.167         40            1.50 × 10⁶     1-10⁶
Phosphorus 33          25.5 days    0.248                       5.32 × 10⁶     10-10⁴
Phosphorus 32          14.3 days    1.709         2710          9.2 × 10⁶      10-10⁵
Iodine 131             8.07 days    0.806                       1.6 × 10⁷      10²-10⁴
Iodine 125             60 days      0.035                       2.2 × 10⁶      10²-10⁴

* The maximum energy for each emission is given. The particle emitted is a β particle, except in the ...

† The unit mCi (millicuries) is a measure of the number of disintegrations per time unit: 1 mCi = 2.2 × 10⁹ disintegrations per
minute; mA (milliatoms) is the atomic weight of the element expressed in milligrams.

‡ These values are for commercially available compounds that may have many carbon or hydrogen atoms.
source: New England Nuclear, Boston.
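
The specific activities of the pure isotopes in Table 6-1 follow from the half-lives alone, since the decay rate of a sample is its decay constant times the number of atoms present. The following Python sketch illustrates the arithmetic. It assumes the standard definitions 1 Ci = 3.7 × 10¹⁰ disintegrations per second and Avogadro's number, neither of which is stated in the table, and the function name is purely illustrative.

```python
import math

AVOGADRO = 6.02214e23          # atoms per mole (assumed standard value)
DPM_PER_MCI = 3.7e7 * 60       # 1 mCi = 3.7e7 disintegrations/s, i.e. 2.22e9 per min

def specific_activity_mci_per_milliatom(half_life_days: float) -> float:
    """Activity (mCi) of one milliatom (1e-3 mol) of a pure radioisotope."""
    half_life_min = half_life_days * 24 * 60
    decay_constant = math.log(2) / half_life_min    # fraction of atoms decaying per minute
    atoms = AVOGADRO * 1e-3                         # atoms in one milliatom
    return decay_constant * atoms / DPM_PER_MCI

# Half-lives taken from Table 6-1: carbon 14 (5730 yr) and phosphorus 32 (14.3 days)
c14 = specific_activity_mci_per_milliatom(5730 * 365.25)
p32 = specific_activity_mci_per_milliatom(14.3)
print(f"14C: {c14:.1f} mCi/mA, 32P: {p32:.2e} mCi/mA")
```

Under these assumptions the computed values come out close to the tabulated 62.4 and 9.2 × 10⁶ mCi/mA, which shows why short-lived isotopes such as phosphorus 32 can be obtained at far higher specific activities than carbon 14.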
Radioisotopes Are Detected by
Autoradiography or by Quantitative Assays 

Two detection schemes for assaying incorporated radio- 
activity are in general use: 

1. In autoradiography, a cell or cell constituent is labeled 
and then overlaid with a photographic emulsion sensi- 
tive to radiation. Development of the emulsion reveals 
the distribution of labeled material. In whole cells, 
autoradiographic studies determine the original sites 
of the synthesis of macromolecules and their subsequent
movements within cells. For example, incorporation
of [³H]thymidine identifies the nucleus as the
major site of DNA synthesis, and cell fractionation and
histologic staining show that most DNA is also in the nucleus
(Figure 6-1). In contrast, the incorporation of



◀ Figure 6-1 The technique of autoradiography. (a) A
radiation-sensitive photographic emulsion containing silver salts
(AgBr) is placed over labeled cells attached to a glass slide
(for the light microscope) or to a carbon-coated grid (for the
electron microscope). The cell regions containing the labeled
molecules emit radioactive particles, along the tracks of
which silver is deposited. When the photographic emulsion is
developed, the silver deposited appears as dark grains under
the light microscope and as curly filaments in the electron
microscope. (b) These fibroblasts from Chinese hamsters
were labeled with [³H]thymidine for 1 h. Two of the cells
were not synthesizing DNA during this time (the larger dark
areas in their nuclei are nucleoli), but one cell was. Small
black grains almost entirely cover that cell nucleus, indicating
the new DNA is there. Part (a) redrawn from E. D. P.
DeRobertis and E. M. F. DeRobertis, 1979, Cell and Molecular
Biology, Saunders, p. 62; part (b) courtesy of D. M.
Prescott.
labeled uridine into RNA has shown that most RNA is
first made in the nucleus but that most RNA in cell
fractions is located in the cytoplasm. Incorporation of
labeled amino acids has revealed that most protein is
made in the cytoplasm. The transport pathway of pro- 
teins from synthesis to secretion was first documented 
by electron microscopic autoradiography, which al- 
lows each silver filament that results from a radioac- 
tive disintegration to be observed. 
2. In quantitative assays, cells are labeled either in vivo 
or in vitro and their constituents are isolated and puri- 
fied in various ways. The amount or type of radioac- 
tivity in these constituents is then measured— by a 
Geiger counter, which detects ions produced in a gas 
by the radioactive emissions, or by a scintillation 
counter, which counts the flashes of light generated by 
mixing the radioactive sample with a substance that 
fluoresces after absorbing the energy of a particle re- 
sulting from the decay of the nucleus of the radioactive 
atom. 

A combination of labeling and biochemical techniques 
is often employed. A cell constituent may be purified be- 
fore it is labeled and, after labeling, be subjected to exper- 
imental procedures. Autoradiography of the labeled 
products of such experiments— most often after they 
have been separated by gel electrophoresis (discussed 
later in this chapter) or by chromatography— is perhaps 
the most common experiment in all of modern biology.

The purpose of the experiment governs the choice of a 
radioisotope as well as the detection method. A labeled 
compound must have a high enough specific activity that 
the radioactivity in the cell fraction is significant enough
to be studied when the compound is incorporated into
cells. For example, ³H-labeled nucleic acid precursors are
available in much higher specific activities than ¹⁴C-labeled
samples are; the former allows RNA or DNA to
be adequately labeled after a shorter time of incorpora- 
tion or in a smaller cell sample. 

In autoradiographic studies, the energy in the particles 
released by radioactive disintegrations affects the experimenter's
ability to localize the site at which the radioactivity
is incorporated. For example, the β particles emitted
by ³²P are so energetic (see Table 6-1) that the streaks
they make on photographic film can be as long as 1 mm,
much longer than the diameters of individual cells. ³H is
highly preferred for locating radioactive substances or
structures in cells: the track created on photographic film
by the β particle released by ³H decay is only about
0.47 µm long; thus ³H-labeled structures can be located
within cells to an accuracy of about 0.5-1.0 µm, or
about one-fifth the diameter of the nucleus of a mammalian
cell.

Pulse-Chase Experiments Must Be 
Designed with Knowledge of the Cell's 
Pool of Amino Acids and Nucleotides 

In many experiments using radioactive metabolic mate- 
rial, a labeled compound is added to cells and the path of 
the labeled compound can then be traced as it moves 
through various compartments or molecules within cells.
One type of experiment, the pulse-chase experiment, utilizes
the brief addition (a pulse) of a labeled compound,
followed by its removal and replacement (the chase) by
an excess of unlabeled compound; the cells or cell constituents
are examined at various times thereafter to monitor
the radioactivity incorporated during the pulse.

Before an amino acid, a nucleoside, or a phosphate ion
(for example) is incorporated into a protein or a nucleic
acid, it enters the cell's pool of molecular building blocks,
a collection of small molecules free to diffuse throughout
the cytoplasm and nucleus of the cell but not necessarily
free to diffuse into or out of membrane-bound organelles



◀ Figure 6-2 A cell's pool of small soluble molecules,
amino acids (aa) and nucleotides (dNTP and rNTP), may be
separated from the macromolecules (DNA, RNA, and proteins)
by adding cold acid, usually trichloroacetic acid (TCA),
which destroys the cell structure and precipitates all
macromolecules. Centrifugation then deposits the macromolecules
in a pellet, leaving the amino acids and nucleotides in the 
supernatant. The rate at which cells take up labeled mole- 
cules and incorporate them into macromolecules can be de- 
termined by taking such samples at frequent intervals after 
the addition of labeled amino acids or nucleotide precursors 
to the cell-culture medium. 




▲ Figure 6-3 (a) If growing cells are exposed to a medium 
containing labeled amino acids, it takes about 5 min for the 
amino acids in the cell pool to reach the maximum specific 
activity. The accumulation of radioactivity in proteins starts 
more slowly, because the label must be incorporated into the 
amino acid pool first. However, if medium containing unla- 
beled amino acids is used instead, the incorporation of radio- 
activity into proteins stops within a few minutes due to the 
rapid equilibration between amino acids inside the cells and 
in the medium. Thus a marked pulse-chase effect is seen.
(b) A pulse of labeled uridine is incorporated into UTP in the
cell pool in about 10 min, and the pool can be diluted somewhat
by excess unlabeled nucleosides outside the cells (drop
in specific activity at 25 min). However, it takes much longer 
for a chase with unlabeled uridine to level off the amount of 
radioactivity incorporated into RNA (reduce the ongoing in- 
corporation of radioactivity to 0). This is because the labeled 
uridine in the pool has been phosphorylated and is unable to 
escape to the medium, and there are almost 20 percent more 
uridine nucleotides in the pool than in cellular RNA. All of
the labeled uridine is eventually incorporated into the RNA,
but only after several hours. Thus no marked pulse-chase effect
is seen.



such as the mitochondria or chloroplasts. Depending on
the growth conditions of the cell, the quantities of components
of the pool can vary. Likewise, the rates at which
different molecules are absorbed, utilized, and secreted by 
the cell can also vary (Figure 6-2). 

Because of the rapid exchange of amino acids between 
the pool and the medium, a clear pulse-chase effect can be 
achieved with them. The acid-soluble pool can be made 
to contain radioactive amino acids in a few seconds, and 
they can be removed just as quickly (Figure 6-3a). 

Ribonucleosides and deoxyribonucleosides, however, 
become phosphorylated soon after they enter the cell 
pool, and phosphorylated compounds do not generally 
leave the cell. Thus labeled nucleosides can enter the cell, 
but no equilibrium is established between the nucleic acid 
precursors in the medium and their phosphorylated coun- 
terparts in the cell. Nevertheless, a practically useful 
pulse-chase effect can be obtained in experiments with 
radioactive deoxyribonucleosides, because the deoxyri- 
bonucleotide content of the cell pool is sufficient for only 
a few minutes of DNA synthesis. Labeled thymidine, for 
example, can be satisfactorily chased even though it is 
phosphorylated, because an amount of thymidylate
(TTP) equal to that in the pool is taken up every few 
minutes by replicating DNA. 

Labeled ribonucleosides behave differently, because it 
takes several hours for enough RNA synthesis to occur to 



consume the content of the ribonucleotide pool in animal 
cells. In most cultured animal cells, the pool does absorb 
a pulse of labeled ribonucleosides quickly, say within 
10 min. However, a marked chase response (one that oc- 
curs within a few minutes) is not possible. Although the 
addition of unlabeled ribonucleosides to the exterior 
medium may further expand the ribonucleotide content 
of the cell pool, dilute the label within it, and decrease the 
rate of RNA labeling, the amount of incorporated label 
does not clearly level off until several hours after the 
chase begins (Figure 6-3b). 

In planning and interpreting experiments that use la- 
beled precursors of proteins, DNA, or RNA to study 
macromolecular synthesis, these characteristics of small 
molecules in the soluble pool must always be borne in 
mind. 
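
The contrast between the two kinds of nucleotide pool can be illustrated with a one-compartment model. Once a chase stops new label from entering, a well-mixed pool that is drawn on at a steady rate loses its label roughly exponentially, with a time constant equal to pool size divided by the rate of polymer synthesis. The Python sketch below rests on that simplification; the turnover times of 3 min and 180 min are hypothetical values chosen only to match the orders of magnitude given in the text ("a few minutes" and "several hours").

```python
import math

def fraction_incorporated(minutes_since_chase: float, turnover_min: float) -> float:
    """Fraction of the pool's label that has moved into macromolecules.

    Assumes a well-mixed pool of constant size, no label crossing the
    cell membrane after the chase begins, and a turnover time equal to
    pool size divided by the rate of polymer synthesis.
    """
    return 1.0 - math.exp(-minutes_since_chase / turnover_min)

def minutes_to_incorporate(fraction: float, turnover_min: float) -> float:
    """Time needed for the given fraction of pool label to be incorporated."""
    return -turnover_min * math.log(1.0 - fraction)

# Hypothetical turnover times (minutes), not measured values
pools = {"deoxyribonucleotides": 3.0, "ribonucleotides": 180.0}
for name, turnover in pools.items():
    t90 = minutes_to_incorporate(0.9, turnover)
    print(f"{name}: 90% of pool label incorporated after about {t90:.0f} min")
```

With a turnover time of a few minutes the trapped label is used up almost at once, so incorporation levels off quickly and a chase appears sharp; with a turnover time of hours, incorporation continues long after the chase begins.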

Labeled Precursors Can Trace the 
Assembly of Macromolecules and Their 
Distribution in a Cell 

When a radioactive building block first enters a cell, it 
can only label the macromolecules that are in the process 
of being constructed. For example, if a radioactive amino 
acid is added to a culture, the nascent (unfinished and still 
growing) protein chains are the first proteins to be la- 
▲ Figure 6-4 Labeled radioactive precursors (red) first
appear in nascent macromolecules. As time passes, molecules 
that contain more radioactive label are completed. At the end 
of an interval equivalent to the synthesis time of a macro- 
molecule, the total amounts of radioactivity in all finished 
and all unfinished molecules are equal. 



beled. As time passes, an increasing number of completed 
chains contain the radioactive label. The time required to 
form a specific macromolecule can be estimated by sam- 
pling a labeled cell culture at very short intervals to com- 
pare the amount of radioactivity in all nascent macromol- 
ecules still attached to the templates with the amount in 
all free (complete) macromolecules. The first finished 
chains obtained after the label is added contain only a 
small amount of the label, because they were almost com- 
pleted before it was introduced. Each nascent chain ini- 
tially also contains a small amount of label; as time 
passes, however, more label accumulates in newly fin- 
ished chains and the nascent chains become completely 
labeled. At this point, there is an equal amount of label in 
the finished and nascent chains (Figure 6-4). The time 
elapsed since the label was added is equal to the time 
required for the synthesis of one chain. 
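
The equality of label in nascent and finished chains after one synthesis time can be checked with a small model. The Python sketch below assumes that chains are initiated at a constant rate and grow at constant speed, so that nascent chains are spread uniformly over all stages of completion, and that finished chains are not degraded. Time is expressed as tau, the time since label was added divided by the synthesis time of one chain, and label is expressed per nascent chain in units of one fully labeled chain. These conventions and the function names are illustrative.

```python
def label_in_nascent(tau: float) -> float:
    """Mean label per nascent chain, for 0 <= tau <= 1.

    A nascent chain that is a fraction x complete carries a labeled
    length min(x, tau); averaging over x uniform on [0, 1] gives
    tau - tau**2 / 2.
    """
    return tau - tau ** 2 / 2

def label_in_finished(tau: float) -> float:
    """Accumulated label in finished chains (same units), for 0 <= tau <= 1.

    One chain is released per nascent chain per synthesis time, and a
    chain released at time s carries a labeled length s, so the total
    is the integral of s from 0 to tau, i.e. tau**2 / 2.
    """
    return tau ** 2 / 2

for tau in (0.25, 0.5, 0.75, 1.0):
    print(tau, label_in_nascent(tau), label_in_finished(tau))
```

In this model the nascent chains hold more label than the finished chains at every time before tau = 1, and the two amounts meet at tau = 1, which is the basis for reading the synthesis time from the crossover point.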



The Dintzis Experiment Proved That Proteins Are 
Synthesized from the Amino End to the Carboxyl 
End Other very important facts — the cellular locus of 
synthesis of a macromolecule and the direction of its 
growth — can be determined by labeling growing chains. 
Indeed, the analysis of newly finished chains was used by 
Howard Dintzis in a classic experiment demonstrating 
the step-by-step formation of protein chains from the 
amino terminus to the carboxyl terminus. Over 90 per- 
cent of the protein synthesized by reticulocytes (the next- 
to-final stage in the differentiation of red blood cells in 
the bone marrow of mammals) consists of the α- and β-globin
chains that form the protein part of hemoglobin.
(Hemoglobin is composed of four globin chains: two α
and two β.) Dintzis exposed reticulocytes to radioactive
amino acids and then, at short intervals, collected the finished
chains. He separated the α and β chains and digested
each with trypsin, an enzyme that cleaves on the
carboxyl side of arginine and lysine residues to produce a 
specific set of fragments for each chain which can be sep- 
arated. Dintzis knew the sequence of amino acids in both 
globin chains as well as the position of each fragment 
within the globin chains. 

Dintzis reasoned that the first completed chains to con- 
tain the radioactive label would be those that were almost 
complete when the label was added. Thus the first por- 
tion of the finished chains to contain label would be near 
the end at which chain synthesis finished and, by exten- 
sion, the last portion of the finished globin chains to be- 
come labeled would lie at the end where chain synthesis 
started. The results showed that the radioactive label al- 
ways appeared in the tryptic fragments in a certain order — 
in the carboxyl-terminal fragment first and the amino- 
terminal fragment last, with intermediate fragments be- 
coming consecutively labeled in the order in which they 
lay between the two termini (Figure 6-5). From this, 
Dintzis deduced that synthesis begins at the amino termi- 
nus of each chain and moves in a step-by-step progression 
to the carboxyl end of the chain. 
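
The reasoning behind the order of labeling can be made concrete with a short Python sketch. It assumes a chain divided into five tryptic fragments named A to E from the amino end to the carboxyl end (the lettering used in Figure 6-5), fragments of equal length (an assumption made only for this illustration), and growth from the amino to the carboxyl end. Time tau is measured in units of the chain synthesis time.

```python
FRAGMENTS = ["A", "B", "C", "D", "E"]   # amino end -> carboxyl end

def labeled_fraction_per_fragment(tau: float) -> dict:
    """Labeled fraction of each fragment in finished chains released at time tau.

    If chains grow from amino to carboxyl end, a chain released at tau
    carries label only in its last (carboxyl-terminal) fraction tau.
    Equal-length fragments are an assumption of this sketch.
    """
    n = len(FRAGMENTS)
    label_start = 1.0 - min(tau, 1.0)          # label covers [label_start, 1]
    result = {}
    for i, name in enumerate(FRAGMENTS):
        lo, hi = i / n, (i + 1) / n            # fragment spans [lo, hi] of the chain
        overlap = max(0.0, hi - max(lo, label_start))
        result[name] = overlap / (hi - lo)
    return result

for tau in (0.1, 0.3, 0.5, 0.7, 0.9):
    labeled = [f for f, frac in labeled_fraction_per_fragment(tau).items() if frac > 0]
    print(tau, labeled)
```

Under these assumptions the earliest finished chains carry label only in fragment E, and fragments D, C, B, and A join in turn, which is the pattern Dintzis observed. Synthesis in the opposite direction would have labeled fragment A first.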

Whereas Dintzis studied the labeling of newly finished 
molecules, other workers have studied nascent molecules. 
(Experiments on the labeling of nascent RNA and DNA 
are described in Chapters 8 and 12.) The logic of these 
studies parallels that of the Dintzis experiment: the short- 
est labeled molecules in a nascent set will be those whose 
sequence is near the start site; increasingly longer mem- 
bers will contain additional sequences progressively more 
remote from the start site. 

Determining the Sizes of 
Nucleic Acids and Proteins 

Whereas the sequence of the monomers in a protein or nu- 
cleic acid ultimately determines the functional capacity of 
the polymer, the most useful physical characteristic in the 






[Figure 6-5 diagram. Steps shown: add labeled amino acids (aa) to cells; extract completed globin chains at intervals from labeled cells; digest with trypsin; separate fragments chemically; assay radioactivity. Panels show the pattern of appearance of label and the direction of synthesis.]


▲ Figure 6-5 A diagram of Howard Dintzis' classic experiment
showing the growth of polypeptides from amino to carboxyl
end. Soon after radioactively labeled amino acids were
added to a suspension of reticulocytes, finished labeled globin
chains were released from ribosomes. After labeling, samples
of released chains were taken at frequent intervals and
cleaved into fragments by digesting the protein with the protease
trypsin; radioactivity in the fragments was then assayed.
In the first sample of finished chains (t₁), all radioactivity
(red) was located in the E fragments at the carboxyl
end. In samples t₂-t₅, fragments were labeled in the order D,
C, B, and finally A. The direction of synthesis was therefore
A → E. Thus Dintzis concluded that protein synthesis begins
at the amino terminus and progresses to the carboxyl terminus.
[See H. Dintzis, 1961, Proc. Nat'l Acad. Sci. USA
47:247.]



analysis of a polymer is its unique length. It is relatively 
easy to separate molecules by length; it is more difficult 
(for proteins) or virtually impossible (for nucleic acids) to 
separate molecules by using chemical differences based 
on sequence differences. Therefore, the length (size) of a 
protein, RNA, or DNA molecule is one of the most fre- 
quent measurements in molecular cell biology. In the fol- 
lowing sections, we briefly outline the principles of mo- 
lecular separation according to size and illustrate their 
use. The newer techniques are so simple and effective that 
they may not be appreciated as "physics in action." Stu- 
dents are encouraged to learn the principles of physical 
chemistry that underlie these crucial techniques (see the
references for this section, particularly those of Cantor 
and Schimmel). 

Centrifugation Is Used to Separate 
Particles and Molecules That Differ in 
Mass or Density 

Two basic uses of centrifugation recur in the experiments 
described in this book: (1) separation of particles accord- 
ing to their mass, and (2) separation according to their 
density. We shall discuss each use in turn. 

Rate-Zonal Centrifugation When particles or mole- 
cules are layered on top of a liquid column in a tube and 
subjected to centrifugation, they migrate down the tube 
at a rate controlled by the centrifugal force, the mass of 
the particles, the difference between the densities of the 
particles and the suspending medium, and the friction 
between the particles and the suspending medium (which 
depends on the shape of the particles). For example, RNA 
molecules of similar average shape and density separate 
in a centrifugal field almost solely according to mass, or 
chain length. After centrifugation is complete, different- 
sized molecules are found in different zones of the centri- 
fuge tube; this separation technique is commonly called 
rate-zonal centrifugation (Figure 6-6a). Samples are cen- 
trifuged just long enough to separate the molecules of 
interest. If they are centrifuged for too short a time, the 
molecules will not separate sufficiently; if they are centri- 
fuged much longer than necessary, all of the molecules 
will end up in a pellet at the bottom of the tube. 

To prevent stirring the contents of the centrifuge tube 
during acceleration and deceleration, the liquid column 
through which particles are sedimented is often stabilized 
by a sucrose solution that is more concentrated (and thus 
denser) at the bottom than the top of the tube. For this 
reason, the technique is sometimes called "sucrose den- 
sity-gradient centrifugation," but this terminology is er- 
roneous because the property mainly responsible for the 
separation of particles by rate-zonal centrifugation is not 
density, but mass. 

Although the sedimentation rate is strongly influenced 
by particle mass, this technique is seldom effective in de- 
◀ Figure 6-6 Two centrifugation techniques are widely
used to separate different subcellular particles, different types 
of molecules, or the same type of molecules of different 
mass. When they have separated sufficiently, the centrifuge is 
stopped and samples are collected from a hole punctured in 
the bottom of the tube. The different samples may be identified
by various assays. (a) Rate-zonal centrifugation separates
particles or molecules that differ in mass but may be similar 
in shape and density (for example, RNA molecules). Here, 
two particles of different mass have been labeled before cen- 
trifugation. (b) Equilibrium centrifugation allows the separa- 
tion of particles that differ in density (for example, because 
they contain different ratios of protein, DNA, or RNA). The 
particles may or may not differ in mass and shape. Particles 
move up or down through a density gradient established 
when the centrifugal force acts on a dissolved salt such as 
CsCl. At equilibrium, the particles in the tube collect at lev- 
els at which the density of the solution equals their own 
density. 



termining exact molecular weights because variations in 
shape also affect sedimentation rate. The exact effects are 
hard to assess, especially for proteins and single-stranded 
nucleic acid molecules that can assume many complex 
shapes. Nevertheless, rate-zonal centrifugation has 
proved to be the most practical method for separating 
many different types of polymers and particles. 

While an analytical ultracentrifuge (one equipped with 
optical instruments that can record light absorption in the 
ultraviolet range, for example, where nucleic acids absorb 
strongly) is in motion, the sedimentation rate can be mea- 
sured by photographing the moving boundaries of sedi- 
menting layers of molecules. Modern ultracentrifuges 
reach speeds of 60,000 revolutions per minute (r/min) or 
greater and generate forces sufficient to sediment parti- 
cles with masses greater than 10,000 daltons. For a parti- 
cle located 6 cm from the rotational axis of a centrifuge, 
60,000 r/min corresponds to a centrifugal force of 
250,000 times gravity (250,000g). However, even at such 
tremendous forces, quite small particles with masses of 
5000 daltons or less diffuse too freely to settle uniformly 
through a centrifugal field. 
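
The relation between rotor speed and centrifugal force quoted above can be checked directly, since the centrifugal acceleration is the square of the angular velocity times the distance from the axis. The Python sketch below assumes standard gravity of 980.665 cm/s², a value not given in the text; the function name is illustrative.

```python
import math

STANDARD_GRAVITY = 980.665   # cm/s^2 (assumed standard value)

def relative_centrifugal_force(rev_per_min: float, radius_cm: float) -> float:
    """Centrifugal acceleration, in multiples of g, at a given radius."""
    angular_velocity = 2 * math.pi * rev_per_min / 60      # radians per second
    return angular_velocity ** 2 * radius_cm / STANDARD_GRAVITY

rcf = relative_centrifugal_force(60_000, 6.0)
print(f"{rcf:,.0f} x g")
```

For 60,000 r/min at 6 cm the result is a little over 240,000 times gravity, consistent with the rounded figure of 250,000g. Because the force scales with the square of the speed, halving the speed cuts the force to one quarter.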

Equilibrium Density-gradient Centrifugation A 
density gradient can be established throughout the sus- 
pending medium before centrifugation. Alternatively, the 
force of centrifugation itself can be used to establish a 
density gradient; this separation technique is called equi- 
librium centrifugation (Figure 6-6b). In both cases, the 
density of the medium should range from less dense than 
the particles to be separated to more dense than these 
particles. During centrifugation, the particles or mole- 
cules in the tube move up or down to the level at which 
the density of the medium is equal to their own density; at 
this level, they are said to be isopycnic with (equally as 
dense as) the medium. Even under tremendous centrifugal 



force, a particle will not sediment through a gradient re- 
gion denser than itself. 

Probably the most commonly used material for making 
density gradients in equilibrium centrifugation is a water 
solution of cesium chloride (CsCl). The cesium ion (Cs⁺)
is so compact that it sediments slightly in the powerful
fields created in modern ultracentrifuges. A gradient is
thereby established, with more Cs⁺ (and more Cl⁻, which
follows the Cs⁺ to neutralize the charge) toward the bottom
of the tube. In a typical ultracentrifuge run, the liquid
will be about 0.02 g/mL denser at the bottom of the tube 
than at the top. Thus molecules that differ in density by 
even a fraction of 0.02 g/mL can easily be separated by 
this technique. The densities of protein, DNA, and RNA 
in a solution of CsCl are approximately 1.3, 1.6-1.7, and 
1.75-1.85 g/mL, respectively, so these molecules are eas- 
ily separated from one another. The densities given here 
are higher than those of the same macromolecules in cells, 
because ions in a CsCl solution bind to proteins and nu- 
cleic acids; the densities of all macromolecules without 
bound ions are almost equal: 1.25-1.3 g/mL. In a CsCl 
solution, the Cs⁺ binds to DNA mainly at phosphate
groups; it binds to RNA both at phosphates and the hy- 
droxyl groups of riboses, thus increasing the density of 
RNA more than that of DNA. Proteins, which have much 
less average charge than nucleic acids do, bind less ce- 
sium. 

Proteins or nucleic acids in which ¹³C or ¹⁵N is substituted
for ¹²C or ¹⁴N in amino acids or nucleotides also
can be separated from their normal counterparts. For
example, since proteins are 14 percent nitrogen and ¹⁵N
is 15/14 times as dense as ¹⁴N (Table 6-2), a protein substituted
completely with ¹⁵N is about 1 percent denser than
the normal protein — a sufficient difference to allow the 
normal protein to completely separate from the substi- 
tuted one. Thus, when cells are grown in a medium con- 
taining heavy amino acids or nucleotide precursors, it is 
possible to physically separate molecules made by the 
cells before and after the addition of the heavy isotope 
(see Figure 12-1). 
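
The 1 percent figure for a fully ¹⁵N-substituted protein is a short piece of arithmetic, sketched here in Python. The calculation assumes that isotopic substitution leaves the volume of the molecule unchanged, so that density rises by the same fraction as mass; the function name is illustrative, and the atomic masses are those listed in Table 6-2.

```python
def fractional_mass_increase(element_mass_fraction: float,
                             heavy_mass: float,
                             natural_mass: float) -> float:
    """Fractional gain in mass when one element is fully replaced by a heavy isotope.

    Assumes molecular volume is unchanged, so density rises by the
    same fraction as mass.
    """
    return element_mass_fraction * (heavy_mass / natural_mass - 1.0)

# Protein: about 14 percent nitrogen by mass (from the text); masses from Table 6-2
gain = fractional_mass_increase(0.14, 15.00, 14.01)
print(f"15N-substituted protein is about {100 * gain:.1f} percent denser")
```

The same function can be applied to other substitutions, such as ¹³C for ¹²C, given the mass fraction of the element in the molecule.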



Table 6-2 Commonly used heavy isotopes and their natural
(more abundant) counterparts*

Heavy isotope             Atomic mass    Natural isotope    Atomic mass

Deuterium (hydrogen 2)    2.01           Hydrogen 1         1.01
Carbon 13                 13.01          Carbon 12          12.00
Nitrogen 15               15.00          Nitrogen 14        14.01
Oxygen 18                 18.00          Oxygen 16          16.00


* The greater density of heavy isotopes is due to the presence of one 
or more additional neutrons in their nuclei. The extra neutrons do 
not affect the chemical bonding properties of the atoms but do affect 
their mass. 
The Sedimentation Constant When a particle suspended
in a medium is subjected to centrifugal force, it
will move if its density d is greater than the density of the
surrounding medium d₀. The speed of movement in a stationary
medium is proportional to the gravitational acceleration
g; in a centrifugal field, g is replaced by the centrifugal
acceleration c, which is equal to (2πω)²x, where
ω is the revolutions per unit of time and x is the distance
of the particle from the axis of rotation.

As the particle moves, it encounters friction with the
medium. As it accelerates, its velocity v increases, increasing
friction. The frictional force φ is equal to fv, where f
is a frictional coefficient related to the shape of the particle.
For a spherical particle, f = 6πηr, where η is the viscosity
of the medium and r is the radius of the particle.
The velocity increases until the frictional force balances
the centrifugal force P_c, after which time the particle continues
to move at a uniform velocity. Stokes' equations

\[ P_c = \tfrac{4}{3}\pi r^{3}(d - d_0)\,c = 6\pi\eta r v \]

\[ r = \sqrt{\frac{9 v \eta}{2 c (d - d_0)}} \]

define the motion of spherical particles in a fluid under
ideal conditions (particles larger than the solvent molecules;
no interaction among particles; no disturbance due to
convection, or heat transfer).

The sedimentation constant v/c = s is characteristic for
a given particle in a given medium at a given temperature.
If r and x are expressed in cm, g or c in cm/sec², and ω in r/sec,
then the sedimentation constant is

\[ s = \frac{v}{c} = \frac{\tfrac{4}{3}\pi r^{3}(d - d_0)}{6\pi\eta r} = \frac{m\,[1 - (d_0/d)]}{f} \]

where m is the mass (in g) of the spherical particle and s
is expressed in sec or in svedbergs (S = 10⁻¹³ sec) under
standard conditions of sedimentation in water at 20°C.
Because the centrifugal force and the density and
viscosity of the medium can all be measured, Stokes'
equations, taken together, can be used to estimate the
radius and mass of the spherical particle, if its density and
sedimentation velocity are measured in a centrifuge. The s
values for a representative set of biologically important
particles are given in Table 6-3.
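
As an illustration of how these relations are used, the Python sketch below evaluates the sedimentation constant of an ideal sphere from Stokes' equations, s = (4/3)πr³(d − d₀)/(6πηr), in CGS units. The particle (radius 10 nm, density 1.3 g/cm³) is hypothetical, and the viscosity of water at 20°C is taken as about 0.01 poise, a value assumed here rather than given in the text.

```python
import math

def sedimentation_constant_sphere(radius_cm: float, density: float,
                                  medium_density: float,
                                  viscosity_poise: float) -> float:
    """s = v/c for an ideal sphere, in seconds (CGS units throughout)."""
    volume = (4.0 / 3.0) * math.pi * radius_cm ** 3
    friction = 6 * math.pi * viscosity_poise * radius_cm     # f = 6*pi*eta*r
    return volume * (density - medium_density) / friction

def to_svedbergs(s_seconds: float) -> float:
    """Convert seconds to svedberg units (1 S = 1e-13 sec)."""
    return s_seconds / 1e-13

# Hypothetical particle: radius 10 nm (1e-6 cm), density 1.3 g/cm^3,
# in water (density 1.0 g/cm^3, viscosity about 0.01 poise)
s = sedimentation_constant_sphere(1e-6, 1.3, 1.0, 0.01)
print(f"{to_svedbergs(s):.1f} S")
```

Because s scales with the square of the radius for a sphere of fixed density, doubling the radius (an eightfold increase in mass) raises s only fourfold. This is one reason sedimentation constants are not proportional to molecular weight.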

Electrophoresis Separates Molecules 
According to Their Charge-Mass Ratio 

Molecules in a mixture can be separated, or resolved, according
to size by electrophoresis, a technique dependent
on the fact that dissolved molecules in an electric field
move at a speed determined by their charge-mass ratio.


Table 6-3 Sedimentation constants and molecular weights
for some molecules and other particles

Molecule or                  Sedimentation      Molecular weight
particle                     constant (S)*      × 10⁻³

PROTEINS
Cytochrome c                 1.7                13.4
Myoglobin                    2.0                16.9
Hemoglobin (α₂β₂)            4.5                64.5
Fibrinogen                   7.6                340

RNA
Transfer RNA (average)       4.0                25-27
Ribosomal RNA:
  E. coli, small             16                 550
  E. coli, large             23                 1100
  Human, small               18                 660
  Human, large               28                 1700

PARTICLES
Ribosome (human)
Poliomyelitis virus
Bacterium



If two molecules have the same mass and
shape, the one with the greater charge will move faster
toward an electrode. Many successful variations of
electrophoresis are in general use; the separation of small
molecules, such as amino acids and nucleotides, is one
example. A small drop of sample is deposited on a strip of
filter paper or other solid substrate, which is then
soaked with a conducting solution. When an electric field
is applied at the ends of the strip, small molecules dissolve
in the conducting solution and move along the strip at a
rate corresponding to their charge.

Nucleic acids in solution generally have a negative
charge because their phosphate groups are ionized; thus
they migrate toward a positive electrode. However, nucleic
acid molecules consisting of long chains have almost
identical charge-mass ratios, whatever their length, because
each residue contributes about the same charge and
mass. Also, many proteins that differ in shape and mass
have almost equal charge-mass ratios. Therefore, if the
electrophoresis of nucleic acids and proteins were simply
carried out in solution, little or no separation of molecules
of varying lengths would occur.
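
The reason chain length alone does not separate nucleic acids in free solution can be seen in a few lines of Python. The sketch assumes one negative charge per nucleotide at neutral pH and an average residue mass of about 330 daltons; the mass is an approximate value assumed here, not one taken from the text.

```python
NUCLEOTIDE_MASS = 330.0      # approximate average residue mass, daltons (assumed)
CHARGE_PER_NUCLEOTIDE = -1   # one ionized phosphate per residue at neutral pH

def charge_mass_ratio(chain_length: int) -> float:
    """Net charge divided by mass (charges per dalton) for a nucleic acid."""
    charge = CHARGE_PER_NUCLEOTIDE * chain_length
    mass = NUCLEOTIDE_MASS * chain_length
    return charge / mass

for n in (100, 1_000, 10_000):
    print(n, charge_mass_ratio(n))
```

Charge and mass both grow in step with chain length, so the ratio is the same for every chain. Some additional size-dependent effect is therefore needed to resolve molecules by length.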

Despite these difficulties, electrophoretic separation
according to chain length has become amazingly reliable.
Molecules are now most commonly subjected to electrophoresis
in a gel (a semisolid suspension in water) rather
than a liquid solution. The size of the pores in such gels
limits the rate at which molecules can move through
them. Nucleic acids with identical charge-mass ratios separate
according to length, with the longer ones moving

[Figure 6-7 diagram labels: negatively charged nucleic acids
or SDS-protein complexes; place mixture on an agarose or
polyacrylamide gel; apply electric field; gel particle; pores;
molecules move through pores in gel at a rate inversely
proportional to their chain length]



▲ Figure 6-7 Gel electrophoresis is carried out by pouring
a liquid containing either melted agarose or chemically 
treated polyacrylamide into a cylinder (for a round gel) or 
between two flat, parallel glass plates 1-2 mm apart. As the 
gel solidifies, it forms interconnected pores, or channels, 
whose size depends on the concentration of agarose or poly- 
acrylamide. The substances to be separated are then layered 
on top of the gel (or at one edge of it if it lies between two 
plates), and an electric current is passed through the gel. In 
usual laboratory practice, the migration of RNA or DNA 
depends on the charges on the phosphates: at neutral pH, a 
nucleic acid bears one negative charge per phosphate. Pro- 
teins can be separated by binding sodium dodecyl sulfate 
(SDS) to their amino acid residues, which contributes ap- 
proximately one negative charge per residue. If all the parti- 
cles have about the same charge-mass ratio, they move 
through the gel at a rate inversely proportional to their chain 
length. 






▲ Figure 6-8 Pulse-field gel electrophoretic separation of
large DNA molecules. In this technique, DNA molecules are 
moved first in one direction by application of an electric 
field. As they move, the molecules stretch out lengthwise in 
the direction of the field. The current is then stopped for a 
short time, and the molecules begin to "relax" into random 
coils; the time required for relaxation depends on the length 
of a molecule. The electric field can then be reapplied at 90° 
to the first direction or opposite to the first direction. Longer 
molecules relax more slowly than shorter ones, and so take 
longer to start moving in the new direction. Repeated alter- 
nation of field direction thus separates the molecules between 
the two directions and makes it possible to separate giant 
DNA molecules of 10^6 base pairs and more. The "ladder" in
lane 2 shows concatemers (linked units) of bacteriophage
λ DNA in which each unit is 48.5 kb long (the band on the
bottom is a single unit). Comparison with this ladder allows
calculation of the length of other long DNA fragments.
Lane 1 shows individual DNA molecules that each represent
one chromosome from Saccharomyces cerevisiae; lane 3
shows a restriction digest with enzyme NotI that was used to
map the E. coli chromosome (see Figure 5-9). [See C. L.
Smith et al., 1987, Science 236:1448; C. L. Smith et al.,
1987, Nuc. Acids Res. 15:4481.] Photograph courtesy of
C. L. Smith.



more slowly (Figure 6-7). Even very long nucleic acids 
(chains containing 10,000-20,000 residues) that differ in 
length by only a few percentage points can be separated. 
In mixtures containing chains of 500 nucleotides or less, 
each chain length can be resolved, which has made DNA 
sequencing possible. 

By employing the new technique of pulse-field gel elec- 
trophoresis, different-sized double-stranded DNAs in the 
range of 1-10 million base pairs (bp), or 1-10 megabases 
(Mb), can now be separated (Figure 6-8). Electrophoretic 
migration is begun in one direction; then the current is 
briefly stopped and reapplied at a 90° angle or in the 
opposite direction. These long molecules tend to align
along the electric field when the current is on and to relax 
when it is off. Relaxation time is affected by pores in the 
gel: longer molecules take longer to relax, and so respond 
more slowly as the current is switched, than shorter ones 
do, allowing the chains to be separated. This technique is 
very important for purifying long DNA molecules. It is 
required for the analysis of cellular chromosomes, which 
range from the smallest yeast chromosomes (about
5 x 10^5 bp) to the largest animal and plant chromosomes
(2 or 3 x 10^8 bp).
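
Sizing an unknown band against a ladder of λ concatemers, as in Figure 6-8, can be sketched as follows. This is a minimal illustration, not the procedure used by the authors of the figure: the migration distances are invented, the function names are hypothetical, and interpolating the logarithm of size linearly between neighboring rungs is an assumed approximation that real pulse-field gels follow only roughly.

```python
import bisect
import math

LAMBDA_UNIT_KB = 48.5  # length of one lambda DNA unit, from the Figure 6-8 caption


def ladder_sizes_kb(n_rungs):
    """Sizes of the first n_rungs concatemers (1 unit, 2 units, ...)."""
    return [LAMBDA_UNIT_KB * units for units in range(1, n_rungs + 1)]


def estimate_size_kb(distance_mm, ladder):
    """Estimate a band's size from (distance, size) ladder points.

    Interpolates log(size) linearly between the two ladder bands that
    bracket the measured distance; bands outside the ladder are rejected.
    """
    points = sorted(ladder)                      # ascending migration distance
    distances = [d for d, _ in points]
    if not distances[0] <= distance_mm <= distances[-1]:
        raise ValueError("band lies outside the ladder; no extrapolation attempted")
    i = bisect.bisect_left(distances, distance_mm)
    if distances[i] == distance_mm:              # band runs exactly with a rung
        return points[i][1]
    (d0, s0), (d1, s1) = points[i - 1], points[i]
    frac = (distance_mm - d0) / (d1 - d0)
    return math.exp(math.log(s0) + frac * (math.log(s1) - math.log(s0)))


# Hypothetical distances (mm) for a four-rung ladder; smaller molecules travel farther.
HYPOTHETICAL_DISTANCES_MM = [40.0, 30.0, 22.0, 16.0]
ladder = list(zip(HYPOTHETICAL_DISTANCES_MM, ladder_sizes_kb(4)))
print(round(estimate_size_kb(35.0, ladder), 1))
```

A band that migrates between the one-unit and two-unit rungs is assigned a size between 48.5 and 97 kb; the closer the ladder rungs, the less the choice of interpolation rule matters.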

Protein chains also can be separated according to 
length. Before and during electrophoresis, the proteins 
are continuously exposed to the detergent SDS (sodium
dodecyl sulfate, a common commercial cleaning agent
found in toothpaste). Approximately one molecule of de-
tergent binds to each amino acid. At neutral pH, the de-
tergent is negatively charged; the adjacent negatively 
charged SDS molecules repel one another, forcing the 
proteins with bound detergent into rodlike shapes en- 
dowed with similar charge-mass ratios. Proteins in this 
state are said to be denatured. As with nucleic acids, 
chain length (which reflects mass) is the determinant for 
the separation of proteins by electrophoresis through poly- 
acrylamide gels (Figure 6-9). Even chains that differ in 
molecular weight by less than 1 percent can be separated. 

Gel Electrophoresis Can Separate Most 
Proteins in a Cell 

The traditional biochemical approach first to enzyme de- 
tection and ultimately to detailed enzyme chemistry is to 
detect enzymatic activity in a sample from a natural 
source and isolate the proteins that catalyze the activity. 
Biochemical methods of separating pure proteins from
natural mixtures rely on differences in sedimentation rate
or on changes in charge that accompany changes in salt
concentration or pH. Such changes in charge cause proteins
to bind differentially to various substances (e.g., cellulose
products) and make chromatography possible.

However, many experiments in molecular biology are 
designed to enumerate the polypeptides formed in a cer- 
tain cell at a certain time, rather than to detect active 
enzymes or determine their concentrations. Sometimes 
just the presence of a given protein is to be detected any- 
where within the cell, without purifying the protein. Or it 
may be important to compare the synthesis rate of a pro- 
tein or a set of proteins with that of all other proteins in 
the cell, again without isolating any particular protein. 
Gel electrophoresis can often accomplish these aims. 

Two-dimensional Gels Electrophoresis of all cellular 
proteins in one direction through a column or a thin rec- 
tangular SDS gel reveals only the major proteins. If these 
proteins are of interest or if a cell is producing large 



[Figure 6-9 lane labels: lane 1, HeLa; lane 2, 293; lane 3, HeLa; lane 4, 293]



▲ Figure 6-9 Resolution of proteins by one-dimensional 
gel electrophoresis. The proteins of two human cell lines — 
HeLa, a human cervical cancer cell, and 293, a virus-trans- 
formed embryonic fibroblast — were dissolved in SDS and 
subjected to electrophoresis. The newly made proteins are 
visible in lanes 1 and 2 by autoradiography, because the cells 
were labeled with [35S]methionine, and in lanes 3 and 4 by
the dye Coomassie blue, which stains all proteins. The desig- 
nations 72 K, 68 K, etc., indicate the positions of marker 
proteins (proteins of known sizes) with molecular weights of 
72,000, 68,000, etc. The major proteins in these two cell 
types are obviously quite similar. Photographs courtesy of 
J. R. Nevins and C. Lawrence. 



amounts of specific proteins (as occurs during viral infec- 
tion), then this one-dimensional analysis may suffice. 

Resolution of virtually all proteins in the cell can be 
accomplished in a two-dimensional gel, which separates 
the proteins in a sample first by charge and then by size 
(Figure 6-10). Separation by charge is carried out by isoe- 
lectric focusing (IEF). A protein that has not been dena- 
tured with SDS has a characteristic overall charge on its 
surface, which varies with pH. When placed in a gradient 
of pH and subjected to an electric field, a protein will 
migrate to the pH at which its overall surface charge is 
neutral and remain at this isoelectric point. Proteins sepa- 
rated in a gel of this type can, while still in the gel, be 
layered on top of another gel soaked with SDS; thus the 
proteins can be separated by electrophoresis in a second 

[Figure 6-10 diagram labels: (a) protein mixture; isoelectric
focusing (IEF), separation in first dimension (by charge);
apply first gel to top of second; SDS electrophoresis.
(b) autoradiographs, with the IEF and SDS axes marked]



▲ Figure 6-10 (a) The preparation of two-dimensional
protein gels by isoelectric focusing (IEF) followed by electro-
phoresis. (b) Labeled proteins can be detected by autoradiog-
raphy. Each spot represents a single polypeptide. The spots
are elongated horizontally because the average charge on a
protein molecule varies somewhat during IEF. These patterns
are reproducible, so that changes in individual proteins can
be detected. The proteins are from cells growing on a normal
medium supplemented with isoleucine (top) and from cells
placed for a brief period in a medium that lacks isoleucine
(bottom). Certain spots (circles) are absent in the bottom
photograph or are much fainter than in the top photograph;
these differences represent changes in the synthesis pattern of
cell proteins in response to amino acid starvation. From
P. H. O'Farrell, 1978, Cell 14:545. Copyright M.I.T.
dimension on the basis of size. As many as several thou-
sand different protein chains — virtually the total protein 
content of a cell — can be detected and separated by this 
technique. Two-dimensional gels are very useful in study- 
ing the expression of various genes in differentiated cells. 
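
The idea that a protein stops migrating at the pH where its net charge is zero can be made concrete with a small calculation. The sketch below is a rough model only: the pKa values are assumed textbook-style approximations (actual values depend on each group's surroundings in the folded protein), the function names are hypothetical, and every ionizable group is treated as independent.

```python
# Assumed approximate pKa values; real values vary with the local environment.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence, ph):
    """Approximate net charge of a polypeptide (one-letter codes) at a given pH."""
    positive = ["N_TERM"] + [aa for aa in sequence if aa in PKA_POSITIVE]
    negative = ["C_TERM"] + [aa for aa in sequence if aa in PKA_NEGATIVE]
    charge = 0.0
    for group in positive:   # fraction of each basic group that is protonated
        charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
    for group in negative:   # fraction of each acidic group that is deprotonated
        charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge falls steadily as pH rises, so the zero crossing is unique.
    """
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


print(round(isoelectric_point("DDDD"), 2), round(isoelectric_point("KKKK"), 2))
```

A chain rich in acidic residues focuses at a low pH and one rich in basic residues at a high pH, which is why the first dimension spreads proteins out by charge before SDS imposes a uniform charge-mass ratio for the second.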

There are two widely used methods of detecting pro- 
teins in gels: 

1. The total amount of each type of protein in a sample 
can be estimated with gel electrophoresis by staining 
the gels with a dye that binds approximately equally to 
all proteins. The intensities of the spots of dye indicate 
the comparative quantities of proteins of different 
lengths. 

2. Gel electrophoresis provides a way of detecting the 
synthesis of any particular protein without isolating it. 
If whole cells are briefly labeled with radioisotopes 



before they are analyzed, each newly synthesized chain 
can be detected in the gel by autoradiography. How- 
ever, because new proteins may be secreted from the 
cell or may be subject to different rates of metabolic 
turnover, the concentration of a labeled protein in a 
cell may not accurately reflect its rate of synthesis. 

In Vitro Protein Synthesis and Gel 
Electrophoresis Provide an Assay for 
Messenger RNA 

Two general approaches are used to determine what pro- 
teins a cell can make. In one method, the contents of 
whole, labeled cells are examined for newly synthesized 
proteins (see Figures 6-9, lanes 1 and 2, and 6-10). In the 
other, mRNA is extracted from the cells and translated in 
▲ Figure 6-11 The translation of mRNA by mixtures of
ribosomes, tRNAs, and protein synthesis factors extracted 
from reticulocytes. Here, the total protein produced by such 
reactions has been separated by electrophoresis and is visible 
by autoradiography because [35S]methionine was added to
the extract. Proteins synthesized from mRNAs in an un- 
treated reticulocyte extract are shown in lane 1; note the 
large amount of globin synthesis (G). In lane 2, a bacterial 
nuclease (from Micrococcus aureus) has greatly reduced the 
amount of synthesis. In lane 3, the nuclease has been chemi-
cally inactivated, and mRNAs from rat pituitary cells have
been added. Several prominent pituitary-specific proteins are 
visible, including two hormone precursors: one of prolactin 
(pre-P n 236 amino acids long), and one of growth hormone 
(pre-GH, 212 amino acids long). [See H. R. B. Pelham and 
R. J. Jackson, 1976, Eur. J. Biochem. 67:247.] Photographs 
courtesy of D. Anderson. 

the presence of labeled amino acids by cell-free protein- 
synthesizing systems (Figure 6-11). Both approaches are 
actually assays for functional mRNAs. In either case, the 
products can be separated and identified by gel electro- 
phoresis. 

Different cell extracts can be used to label proteins in 
vitro (assay for active mRNAs). Bacterial cell extracts 
that can translate homopolymers were first widely used 
to break the genetic code and to examine bacterial and 
bacteriophage proteins; now extracts of eukaryotic cells 
are also commonly used. Two of the most popular cell- 
free systems are extracts of reticulocytes and of wheat 
germ, the embryo plant in a fertile wheat seed. Both are 
prepared by treating the cells first with a nuclease that 
destroys endogenous mRNA (mRNA from the source 
cells) and then with a chemical that blocks the nuclease so 
subsequently added mRNA is not destroyed. After this 



treatment, very little protein synthesis by endogenous 
mRNA occurs (see Figure 6-11, lane 2). Thus the added 
mRNA is responsible for almost all protein synthesis, and 
the products of the added mRNA can be easily detected. 

Examining the Sequences of 
Nucleic Acids and Proteins 

The first biopolymer to be sequenced was a protein, and
this achievement has great historical importance. Before
Fred Sanger reported the sequence of bovine insulin in
1953, some biochemists were not convinced that proteins
had specific sequences from end to end. The single unique
sequence found in insulin implied a highly precise order-
ing mechanism during protein synthesis. Since that time,
the coding of protein sequence by nucleic acid sequence
has been made clear. Recently, it has become much easier
to obtain long nucleic acid sequences than long protein
sequences and thus, with the aid of the genetic code, to
deduce the sequences of many proteins rather than deter-
mine them directly.
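
Deducing a protein sequence from a nucleic acid sequence amounts to reading the coding strand three bases at a time through the standard genetic code. The following sketch shows the idea under simplifying assumptions: the reading frame is taken to start at the first base supplied, translation stops at the first stop codon, and the function name and example sequence are illustrative rather than taken from the text.

```python
from itertools import product

# Standard genetic code, listed with each codon position cycling through T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate a coding sequence (DNA or RNA letters) in the frame given.

    Reading stops at the first stop codon or when fewer than three bases remain.
    """
    dna = coding_sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":        # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# A short illustrative coding sequence and a variant with one base changed.
print(translate("ATGGTGCATCTGACTCCTGAGGAG"))
print(translate("ATGGTGCATCTGACTCCTGTGGAG"))
```

The two example sequences differ by a single base, and the deduced peptides differ by a single residue (glutamate in one, valine in the other), which illustrates how a nucleic acid sequence alone can reveal an amino acid substitution.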

Because the functions of nucleic acids and proteins de- 
pend on the linear sequences of their monomers, research 
in molecular biology relies heavily on techniques that re- 
veal and compare sequences. However, the sequence in- 
formation required in experiments varies considerably in 
extent and type. In the simplest case, only an estimate of 
the degree of similarity, or sequence relatedness, between 
two samples of nucleic acid or protein is required. Often, 
it is necessary simply to determine whether a particular 
sequence is present in a given mixture of nucleic acids or 
proteins. Once the presence of a certain sequence in a 
mixture of sequences is established, a variety of other 
questions arise. What is the concentration or amount of 
the specific sequence? Where within a DNA, RNA, or 
protein molecule is the sequence of interest located? And 
finally, what is the precise nucleotide or amino acid se- 
quence for the entire molecule? A variety of techniques 
are used to address these questions; each applies better to 
some questions than to others. 

Molecular Hybridization of Two Nucleic 
Acid Strands Can Be Detected in Several 
Ways 

Under the conditions of temperature and ion concentra- 
tion found in cells, DNA is maintained as a duplex (two- 
stranded) structure by the many hydrogen bonds of the 
A-T and G-C base pairs. The duplexes can be melted (de-
natured into single strands) by heating them (usually in a
dilute salt solution of, for example, 0.01 M NaCl) or by
raising the pH above 11. If the temperature is lowered
and the ion concentration in the solution is raised, or if 
the pH is lowered, the single strands will anneal, or reas- 
sociate, to reconstitute duplexes (if their concentration in 
solution is great enough). In a mixture of nucleic acids,
only complementary strands reassociate; the extent of 
their reassociation is virtually unaffected by the presence 
of noncomplementary strands. Such molecular hybridiza- 
tion can take place between complementary strands of 
either DNA or RNA or between an RNA strand and a 
DNA strand (Figure 6-12). 

Visualization of Hybrids Electron microscopic ex- 
amination of molecular hybrids conveniently reveals the 
sequence relatedness of two nucleic acid samples. If two 
melted nucleic acid samples that are complementary over 
only part of their length are allowed to hybridize, a heter- 
oduplex results (Figure 6-13); complementary (duplex) 
and noncomplementary (single-stranded) regions can be 
distinguished in such preparations. This technique can be 
used not only to compare DNA strands but also to locate 
DNA sites complementary to RNA molecules. By this lat- 
ter procedure, it is possible to distinguish and locate the 
regions of DNA that are transcribed into RNA. Regions 
of RNA-DNA hybridization create loops (called R loops) 
in the nucleic acid molecules, where the RNA sequence 
has base-paired with one DNA strand and displaced the 
other DNA strand (Figure 6-14). The fact that 1 μm of

[Figure 6-13 diagram labels: E. coli DNA; heteroduplex loop]

▲ Figure 6-13 Electron micrograph (top) of a DNA heter- 
oduplex. DNA molecules on a carbon grid can be distin- 
guished as long threads when they are shadowed with heavy 
metals (here, platinum and palladium). This heteroduplex has 
formed from strands of two λ bacteriophages incorporating
different but related sequences of E. coli DNA (bottom). The
λ strands form a double-stranded hybrid where the inserted
E. coli sequences are complementary (red); the dissimilar in-
serted sequences remain unassociated, resulting in a hetero-
duplex loop of single-stranded DNA (blue). From R. W.
Davis and J. S. Parkinson, 1971, J. Mol. Biol. 56:403.
▲ Figure 6-12 Molecular hybridization: reassociation of
the complementary strands of a nucleic acid. Under condi- 
tions of high pH or temperature, the duplexes in a solution 
of nucleic acids melt, or separate into single strands. With an 
appropriate change in conditions, complementary strands 
reassociate. The presence of noncomplementary chains does 
not affect the reassociation rate of complementary chains. 
▲ Figure 6-14 If double-stranded DNA is treated with a
50-percent solution of formamide at room temperature (or 
25°C), some hydrogen bonds between the strands of the mol- 
ecule break, weakening but not completely melting the du- 
plex. If RNA that is complementary to one strand of the 
duplex DNA is then introduced, the RNA binds to its com- 
plementary site on one DNA strand, displacing the other 
DNA strand. This occurs because an RNA-DNA duplex is 
more stable than a DNA-DNA duplex. The hybrid duplex 
and the displaced stretch of single-stranded DNA are called 
an R loop. The two R loops that appear in this electron mi- 
crograph result from the hybridization of 18S and 25S ribo- 
somal RNA from yeast with a region of bacteriophage λ
DNA that contains an inserted stretch of yeast ribosomal 
genes. [See M. Thomas, R. L. White, and R. W. Davis, 1976, 
Proc. Nat'l Acad. Sci. USA 73:2294.] Photograph courtesy 
of R. W. Davis. 
▲ Figure 6-15 Autoradiograph showing in situ hybridiza-
tion. A mouse liver section was exposed to labeled RNA
complementary to glutamine synthetase mRNA. The label
was allowed time to hybridize; then unhybridized labeled
RNA was washed away. (a) A light microscopic view of the
autoradiograph showing cords of liver cells (hepatocytes)
around a central vein. The barely visible dark grains around
the central vein are in the first layer of hepatocytes. (b) The
second view is a dark-field picture, which shows the grains
with greater contrast. [See F. Kuo et al.,
1988, Mol. Cell Biol. 8:4966.] Photographs courtesy of


double-stranded nucleic acid contains about 3000 base pairs
can be used to estimate the number of nucleotides in
single- and double-stranded regions, and thus to construct an
accurate map of the transcribed section of DNA. The technique
was crucial in proving splicing of mRNA in eukaryotes.
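
The conversion from a length measured on an electron micrograph to a number of residues is simple proportion. The sketch below uses the approximate figure of 3000 per micrometer of duplex quoted above; the function name and the measured lengths are hypothetical, and single-stranded regions, which are less regularly extended, would need their own calibration.

```python
BASE_PAIRS_PER_MICROMETER = 3000  # approximate figure for duplex nucleic acid quoted in the text


def duplex_length_to_base_pairs(length_um):
    """Convert a measured duplex contour length (micrometers) to base pairs."""
    return round(length_um * BASE_PAIRS_PER_MICROMETER)


# Hypothetical measurements of duplex segments along one molecule.
segments_um = {"left arm": 1.5, "R loop hybrid": 0.6, "right arm": 2.2}
for name, length_um in segments_um.items():
    print(name, duplex_length_to_base_pairs(length_um), "bp")
```

Adding the converted lengths segment by segment along the molecule gives the kind of physical map described above, with the hybrid regions marking the transcribed DNA.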

In Situ Hybridization Another use for molecular
hybridization that has achieved great popularity is called
in situ hybridization. Labeled RNA or DNA that is com-
plementary to a specific mRNA is prepared. Cells or tis-
sue slices are briefly exposed to heat or acid, which fixes
the cell contents, including the mRNA, in place on a glass
slide; the fixed cell or tissue is then exposed to the labeled
complementary RNA for hybridization. Removal of un-
hybridized labeled RNA and coating of the slide with a pho-
tographic emulsion are followed by autoradiography to
reveal the presence and even the location of specific
mRNA within individual cells (Figure 6-15).
▲ Figure 6-16 The filter-binding assay for RNA-DNA (or
DNA-DNA) hybridization is an extremely popular and flexi-
ble method of detecting complementary regions. Under the
proper conditions of ion strength and temperature, filter-
bound single-stranded DNA is exposed to a labeled RNA (or
DNA) sample. Molecules or sequences complementary to the
filter-bound DNA pair with it; unpaired labeled molecules
can be removed. This technique allows as little as 1 part in
10^[exponent illegible] of specific RNA or DNA to be detected.
Hybrids on Filter-immobilized Nucleic Acid A
common method of detecting hybrids between nucleic 
acid samples employs a single-stranded nucleic acid at- 
tached to a solid matrix. Nitrocellulose and treated nylon 
membranes are the most widely used matrices; it is not 
known why single-stranded DNA (or RNA) binds to 
these substrates, but this affinity is enormously useful. 
The radioactive RNA or DNA that is to be tested for 
sequences complementary to the bound nucleic acid is 
allowed to hybridize with it. After sufficient time, the 
unhybridized single strands unassociated with the bound 
nucleic acid are washed away. RNA hybridized to bound 
DNA is resistant to ribonucleases, whereas unpaired 
RNA regions are digested by these enzymes; thus any 
remaining single-stranded RNA regions are trimmed 
away in such experiments. The amount of hybrid formed 
can then be measured by the amount of bound radioac- 
tive label present (Figure 6-16). With the appropriate 
choice of filter-bound nucleic acid, one specific RNA (or 
DNA) sequence can be detected in a mixture of many 
different sequence types. 

In the procedure called DNA excess hybridization, the 
total RNA from cells (a very complex mixture of se- 
quences) is labeled and exposed to unlabeled purified spe- 
cific DNA. If the DNA is present in excess (if there are 
more than enough copies to hybridize with all comple- 
mentary segments of RNA), the amount of hybrid formed 
is proportional to the amount of RNA input. This allows 
an accurate measurement of the amount of the particular 
RNA in a mixed sample. 

It is also possible to test for the presence of a particular 
RNA sequence and to quantify the amount of it in differ- 
ent samples by competition hybridization. A measured 
sample of a specific labeled RNA is exposed to just 
enough complementary DNA to completely hybridize 
with it; a sample of unlabeled RNA is then added. If the 
unlabeled RNA sample contains the same sequence as the 
labeled RNA, they "compete" for the DNA; increasing 
the ratio of unlabeled to labeled samples decreases the 
amount of labeled RNA hybridized. The extent to which 
this takes place is a measure of the amount of competing 
RNA in the unlabeled sample. 
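
The reasoning behind competition hybridization can be put in numbers with an idealized model. The sketch assumes that the DNA can bind exactly as much RNA as the labeled sample contains, that labeled and unlabeled molecules of the same sequence compete on equal terms, and that hybridization goes to completion; these are simplifying assumptions, and the function names are hypothetical.

```python
def labeled_fraction_hybridized(labeled_amount, competitor_amount):
    """Fraction of the labeled RNA that hybridizes in the idealized model.

    The limiting DNA binds labeled and unlabeled molecules in proportion
    to their abundance, so the labeled share is L / (L + U).
    """
    return labeled_amount / (labeled_amount + competitor_amount)


def competitor_amount_from_fraction(labeled_amount, fraction_hybridized):
    """Invert the model: estimate the competing RNA in the unlabeled sample."""
    if not 0.0 < fraction_hybridized <= 1.0:
        raise ValueError("fraction hybridized must lie in (0, 1]")
    return labeled_amount * (1.0 / fraction_hybridized - 1.0)


# If half of 2.0 units of labeled RNA is displaced, the sample held about 2.0 units.
print(competitor_amount_from_fraction(2.0, 0.5))
```

In this model an unlabeled sample with no matching sequence leaves the hybridized fraction at 1, and each doubling of the competitor pushes the fraction lower, which is the behavior the assay measures.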

The Rate of Nucleic Acid Hybridization Can Be a 
Measure of Complexity The rate of hybridization 
between two complementary single-stranded nucleic 
acids in solution depends on the frequency with which 
complementary regions collide and nucleate, or start to 
form a duplex. This frequency, in turn, depends on the 
concentration of the two strands. Consider DNA fragments
of two different organisms (say, Escherichia coli and
yeast) incubated in amounts that yield the same total
DNA concentration. The complexity of the DNA (the
number of base pairs in the total genome) is about four
times as great for yeast as for E. coli, so any particular
E. coli sequence is about four times as concentrated as any
particular yeast sequence. A separated strand of E. coli DNA
therefore encounters its correct partner
four times as often as a strand of yeast DNA does, and
E. coli DNA reassociates at a faster rate (Figure 6-17).

From the equation for determining the quantitative re- 
lation between reassociation rate and genome complexity 
(given in Figure 6-17), the reassociation rate of any DNA 
sample can be used to calculate the relative complexity of 
the source genome. Experimentally, the initial concentra-
tion of DNA (C_0) and the time (t) are varied to measure the
reassociation rate, so the resulting curves are often called
C_0t (cot) curves.
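
The second-order relation given in Figure 6-17, in which the fraction of DNA still single-stranded equals 1/(1 + K C_0 t), can be explored numerically. The sketch below assumes, following the usual interpretation of such curves, that the rate constant K is inversely proportional to genome complexity when fragment size and conditions are held fixed; the rate constants used are arbitrary illustrative numbers and the function names are hypothetical.

```python
def fraction_single_stranded(c0t, rate_constant):
    """Fraction of strands not yet reassociated: C_t / C_0 = 1 / (1 + K * C_0 * t)."""
    return 1.0 / (1.0 + rate_constant * c0t)


def c0t_half(rate_constant):
    """C_0 t value at which half of the DNA has reassociated (equals 1 / K)."""
    return 1.0 / rate_constant


def relative_complexity(c0t_half_sample, c0t_half_reference):
    """Complexity of a sample relative to a reference, assuming K varies as 1/complexity."""
    return c0t_half_sample / c0t_half_reference


# Arbitrary illustrative rate constants: the faster reassociation has the larger K.
K_REFERENCE = 1.0   # e.g., the simpler genome
K_SAMPLE = 0.25     # e.g., a genome that reassociates four times more slowly

print(relative_complexity(c0t_half(K_SAMPLE), c0t_half(K_REFERENCE)))
```

A sample whose half-reassociation point lies at a fourfold higher C_0t value is thus assigned a fourfold greater complexity, matching the comparison of yeast and E. coli DNA made above.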

If the DNA sequences of an organism are present once 
per haploid genome, the reassociation curve is uniform. If 
some sequences are repeated, these hybridize more rap- 
idly. Reassociation measurements have been important 
both in comparisons between different types of organisms 

[Figure 6-17 diagram label: Any E. coli fragment (e.g., E1 or
E2) is four times more concentrated than any yeast fragment
(Y1 or Y2); therefore the E. coli DNA sample reassociates
four times as fast as the yeast DNA sample]



▲ Figure 6-17 The complexity of DNA controls the rate 
of its reassociation. The relative hybridization rates within 
two samples of dissolved and melted genomic DNA depend 
on their relative complexity (the number of DNA base pairs 
in the genome of each organism), provided the samples are 
equal in absolute concentration (total nucleotide concentra- 
tion). DNA is broken into pieces of 1000-2000 bases each, 
so size plays little role in the comparison. The equation for 
the reassociation rate is

$$\frac{C_t}{C_0} = \frac{1}{1 + K C_0 t}$$

where C_0 and C_t are the molar concentrations of single
strands at times 0 and t, respectively, and K is the rate con-
stant for the particular type of DNA (this constant depends
on the complexity of the DNA). [See R. J. Britten and D. E.
Kohne, 1968, Science 161:529.]
[Key to the spots in Figure 6-18]

Spot number    Sequence
1              AG
3              CCCG
4              AAG
5              ACCG
6              AAAG
8              CG
9              AUG
10             UAG
11             UCG
12             CCUG
13             CCUACG
14b            AAUACCG
14c            AUCCAG
15             CCACACCACCUG
16             UUAG
17             UCUG
18             AUCUCG
20             CCG
21             CG
22             PG
G              Gp



▲ Figure 6-18 A ribonuclease T1 fingerprint of 5S ribosomal RNA from oocytes of the frog
Xenopus laevis. This enzyme cuts RNA on the 3' side of all guanylate residues
(NpGp↓Np(Np)nGp↓Np), producing fragments that all contain one Gp (guanylate) at their 3'
ends. The digest is applied to treated paper (cellulose acetate), and a two-step separation is car-
ried out: electrophoresis in one dimension (arrow 1), followed by chromatography in the other
(arrow 2). If the starting sample is radioactive (32P-labeled RNA is often used), the oligonucleo-
tides can be identified by autoradiography. Spots of RNA can be cut out of the paper sheet and
further analyzed biochemically to determine their sequences. No spots are numbered 2, 7, 14a,
or 19; these numbers were given to oligonucleotides identified in another type of 5S rRNA. From
D. D. Brown, D. Carroll, and R. D. Brown, 1977, Cell 12:1045. Copyright M.I.T.
(prokaryotic versus eukaryotic; vertebrate versus inverte-
brate; and so on) and in studies of the degree of repetition 
of certain sequences within eukaryotic genomes. (Repeti- 
tious DNA sequences are discussed in detail in 
Chapter 10.) In a variation of the use of reassociation 
curves, a trace amount of radioactive pure DNA is added 
to unlabeled RNA from a cell of interest. Because the 
DNA is present in a tiny amount, the rate of RNA-DNA 
hybridization depends on the concentration of comple-
mentary RNA. From that rate, the amount of comple-
mentary RNA in a sample can be estimated. Data curves
from such measurements are referred to as R_0t (rot, or
RNA concentration) curves.

Fingerprinting (Partial Sequence Analysis) 
Allows Quick Comparisons of 
Macromolecules 

The enzymatic fragmentation of proteins and nucleic 
acids at specific sites provides a means of recognizing par- 
ticular macromolecules quickly. As we saw in Chapter 2, 
the enzyme trypsin digests protein chains, cleaving them 
on the carboxyl side of each lysine and each arginine resi- 
due to produce a specific set of peptides from any given 
pure protein. Likewise, the enzyme ribonuclease T1 cuts
RNA at the 3' side of each guanylate residue to produce
specific fragments ending with a guanylate. The resulting 
oligonucleotides are fairly short: they normally contain 
2-20 nucleotides, because consecutive guanylates are 
usually no more than 20 bases apart. Reliable separation 
and detection of different peptides from a pure protein or 
different oligonucleotides from a pure RNA sample can 







be accomplished by electrophoresis, chromatography, or 
both. 
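
The cleavage rules described above are simple enough to apply to a sequence on paper or in a few lines of code. The sketch below cuts after every guanylate (ribonuclease T1) or after every lysine and arginine (trypsin); it ignores the exceptions and incomplete digestion seen with real enzymes, and the function names and example sequences are hypothetical.

```python
def cleave_after(sequence, residues):
    """Split a sequence after every occurrence of any residue in `residues`."""
    fragments, current = [], ""
    for symbol in sequence:
        current += symbol
        if symbol in residues:
            fragments.append(current)
            current = ""
    if current:                      # trailing piece with no cleavage site
        fragments.append(current)
    return fragments


def rnase_t1_digest(rna):
    """Fragments expected from cutting on the 3' side of each G."""
    return cleave_after(rna.upper(), "G")


def trypsin_digest(protein):
    """Peptides expected from cutting on the carboxyl side of each K and R."""
    return cleave_after(protein.upper(), "KR")


print(rnase_t1_digest("AUGCCGAAGUU"))
print(trypsin_digest("MVHLTPEEKSAVR"))
```

Every fragment except possibly the last ends in the residue that the enzyme recognizes, so two molecules with the same sequence always yield the same set of fragments, and a single substitution changes only the fragment that contains it.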

Because the oligonucleotides or peptides produced by a 
given enzyme from a pure RNA sample or protein are 
always the same, the pattern of separated fragments is 
always the same. The characteristic pattern of fragments 
from a primary sequence is called a fingerprint (Figures 
6-18 and 6-19). Fingerprinting, or partial sequence analy- 
sis, allows the rapid comparison of two samples of RNA 
or protein when there is no need to determine the com- 
plete sequence of nucleotides or amino acids. The finger- 
printing technique was first developed for proteins, which 
can be cleaved by both enzymes and chemical reactions. 
With the historic fingerprints of globin shown in
Figure 6-19, Vernon Ingram demonstrated that people
suffering from sickle-cell anemia, a genetic disease, have a
valine substituted in one position in place of a glutamic
acid in their β globin. This was the first mutant protein
shown to be affected in function by a change in one
amino acid residue.



Restriction Enzymes Allow the Precise 
Mapping of Specific Sites in DNA 

The most flexible, simple, and useful technique for partial 
sequence analysis of DNA was made possible by the dis- 
covery of bacterial restriction endonucleases, which rec- 
ognize specific short oligonucleotides from four to eight 
residues long in DNA and then cleave the DNA at each 
site (Figure 6-20a). The word "restriction" refers to the 
▲ Figure 6-19 Fingerprints of (a) normal and (b) sickle-cell
human β-globin. Proteolytic enzymes such as trypsin are used
to break the peptide chain at known amino acid residues 
(trypsin cuts after each arginine and each lysine). The result- 
ing set of specific fragments can then be separated by electro- 
phoresis followed by chromatography. Individual peptide 
spots can be distinguished by spraying the chromatography 
paper with ninhydrin, a reagent that forms a purple product 
with free amino groups. (Spots 23, 24, and 26 show up 
poorly with ninhydrin.) 

These fingerprints are identical, with one exception. Pep-
tide 4 of the β chains of hemoglobin S (the hemoglobin of
people with sickle-cell disease) is found in a slightly dif-
ferent location than peptide 4 of normal hemoglobin A.
Analysis of the two peptides has shown that hemoglobin S
has a valine instead of a glutamic acid at residue 6 in the
β chain; thus a single amino acid replacement is the cause of
sickle-cell anemia. This represented the first demonstration
that a random mutation in nature resulted in a single amino
acid substitution. From V. Ingram, 1958, Biochim. Biophys.
Acta 28:543.



▼ Figure 6-20 (a) EcoRI and many other restriction endonucleases
cleave DNA so that the fragments have short complementary
single-stranded segments at the ends. These
"sticky ends" are important in recombinant DNA techniques
because they readily pair with the ends of other cleavage
fragments produced by the same restriction endonuclease.
EcoRI recognizes the sequence shown here. (b) Most cells
with restriction endonucleases also have corresponding
modification enzymes. EcoRI methylase, a modification
enzyme, catalyzes the methylation of two adenylates
(shown in blue) in the recognition sequence; this prevents
cleavage by EcoRI. Thus a cell making EcoRI endonuclease
and methylase does not destroy its own DNA.

Table 6-4 Examples of the actions of restriction endonucleases

                                                           Number of cuts‡
Source                        Enzyme*  Recognition     λ        Ad2      SV40      pBR322
microorganism                          site (↓)†       (50 kb)  (36 kb)  (5.2 kb)  (4.3 kb)

Arthrobacter luteus           AluI     AG↓CT           143      158      34        14
Thermus aquaticus             TaqI     T↓CGA           121      50       1         13
Haemophilus parahaemolyticus  HphI     GGTGA+5         168      99       4         18
Haemophilus gallinarum        HgaI     GACGC+8         102      87       0         12
Escherichia coli              EcoRI    G↓AATTC         5        5        1         1
Haemophilus influenzae        HindIII  A↓AGCTT         6        12       6         1
Nocardia otitidis-caviarum    NotI     GC↓GGCCGC       0        7        0         0
Streptomyces fimbriatus       SfiI     GGCCN4↓NGGCC    0        3        1         0

† Recognition sequences are written 5' → 3' (only one strand is given). For example, G↓GATCC is an abbreviation for

      ↓
(5')GGATCC(3')
(3')CCTAGG(5')
         ↑

Some enzymes, such as HphI and HgaI, cut five or eight bases away from the recognition sequence; N indicates any base.

‡ [Partly illegible in source: relates the number of cuts expected in a random sequence to the length of the site recognized.]

source: R. J. Roberts, 1988, Nuc. Acids Res. 16(supp):r271.

function of these enzymes in the bacteria of origin: a re- 
striction endonuclease destroys (restricts) incoming for- 
eign DNA (for example, bacteriophage DNA or DNA 
accidentally taken up during transformation) by cleaving 
it at these specific sites, called restriction sites. 

Another enzyme protects a bacterium's own DNA 
from cleavage by modifying it at or near each potential 
cleavage site: a methylase adds a methyl group to one or 
two bases, usually within the restriction site. When a 
methyl group is present there, the restriction endonucle- 
ase is prevented from cutting the DNA (Figure 6-20b). 

Together with the restriction endonuclease, the methylating
enzyme forms a restriction-modification system that protects
the host DNA while it destroys foreign DNA.

A restriction endonuclease cuts a pure DNA sample into a
consistently reproducible set of fragments that can be easily
separated by gel electrophoresis (Figure 6-21). Several hundred
restriction enzymes with different recognition sites are now
available (see Table 6-4). If the order of nucleotides in DNA
were random, the number of cuts expected would be larger for
an enzyme that requires only a four-base site than for one that
requires a longer site and larger for longer DNAs than for
shorter ones. However, the sites for restriction endonucleases
are not randomly distributed; by testing a series of enzymes
that cut at different sites it is possible to cut a particular
DNA many or only a few times. The most recently discovered
eight-base cutters have proved to be especially useful for
producing very large fragments, which then can be separated
by pulse-field gel electrophoresis (see Figures 5-9




[Figure 6-21 gel labels: A = 1768 bp; B = 1169 bp; C = 1101 bp;
D = 526 bp; E = 447 bp; F = 215 bp]

▲ Figure 6-21 The DNA from SV40 virus can be purified
and digested with the restriction endonuclease HindIII (from
Haemophilus influenzae). The digest is then subjected to
electrophoresis in a gel containing ethidium bromide, a molecule
that binds to DNA and fluoresces when exposed to ultraviolet
irradiation. Lane 1 represents the uncut DNA; lane 2, the
digested DNA. HindIII cuts the SV40 molecule at six sites
(↓), producing six fragments. By convention, the pieces of
DNA released by a restriction endonuclease are labeled A-Z
in order of decreasing size; the HindIII fragments of SV40 are
therefore labeled A-F. Photograph courtesy of D. Nathans.

and 6-8). Fragments of 1-10 megabases (10^6 to 10^7 bp)
are used to map the chromosomes of very large genomes,
such as those of mouse and man.
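
The dependence of cut frequency on site length can be made concrete with a short calculation. The sketch below assumes an idealized DNA in which the four bases are equally frequent and independent (which, as noted above, real DNA is not), so that one specific n-base site is expected about once every 4^n positions. The function name `expected_cuts` is illustrative only.

```python
def expected_cuts(dna_length_bp, site_length):
    """Expected number of occurrences of one specific site of
    `site_length` bases in random DNA of `dna_length_bp` bases,
    assuming equal, independent base frequencies (an idealization)."""
    positions = dna_length_bp - site_length + 1  # possible start positions
    if positions <= 0:
        return 0.0
    return positions / 4 ** site_length


if __name__ == "__main__":
    lambda_bp = 50_000  # approximate size of the 50-kb DNA in Table 6-4
    for n in (4, 6, 8):
        cuts = expected_cuts(lambda_bp, n)
        print(f"{n}-base site: about {cuts:.1f} cuts expected in 50 kb")
```

For a 50-kb DNA this idealized estimate gives roughly 195, 12, and 0.8 cuts for four-, six-, and eight-base sites. The observed values in Table 6-4 for the 50-kb DNA (143 for AluI, 5 for EcoRI, 0 for NotI) follow the same trend but differ in detail, as expected when sites are not randomly distributed.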

Digestion of DNA by restriction endonucleases, fol- 
lowed by simple electrophoretic separation of the frag- 
ments, has revolutionized chromosome mapping. The use 
of two or more restriction endonucleases on a pure DNA 
sample can show the order of the restriction sites in a 
DNA sample (Figure 6-22). Also, many sites can be lo- 
cated by partial digestion of terminally labeled DNA with 
only one enzyme (Figure 6-23). In these ways, it is possi- 
ble to produce a map showing the order of the restriction 
sites in any region of DNA. An important application of 
restriction endonucleases is their use to cut off one end of 
a DNA sample that has been end-labeled so that DNA 
pieces labeled at the other end are available for further 
study (see Figure 6-23). 
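
The logic of the two mapping strategies just described can be sketched in a few lines. The example below is a simplified model, not a laboratory protocol: it assumes a linear DNA, exact fragment lengths, and hypothetical site positions chosen only for illustration. `digest` models a complete digestion; `labeled_partial_products` models a partial digestion of DNA labeled at one end (position 0), in which each molecule is cut at most once.

```python
def digest(length, sites):
    """Fragment lengths (left to right) after complete digestion of a
    linear DNA of `length` bp cut at the given site positions."""
    cuts = sorted({s for s in sites if 0 < s < length})
    edges = [0] + cuts + [length]
    return [right - left for left, right in zip(edges, edges[1:])]


def labeled_partial_products(length, sites):
    """Lengths of fragments that still carry a label at position 0 when
    each molecule is cut at most once (uncut molecules stay full length)."""
    return sorted({s for s in sites if 0 < s < length}) + [length]


if __name__ == "__main__":
    length = 1000  # hypothetical 1000-bp DNA
    # Single digests give fragment sizes but not the relative order of sites.
    print(sorted(digest(length, [400])))        # enzyme I alone
    print(sorted(digest(length, [700])))        # enzyme II alone
    # Two candidate maps agree with the single digests; the double digest
    # tells them apart.
    print(sorted(digest(length, [400, 700])))   # II site near the right end
    print(sorted(digest(length, [400, 300])))   # II site near the left end
    # Partial digestion of end-labeled DNA: one labeled product per site.
    print(labeled_partial_products(length, [200, 500, 900]))
```

In this toy case the two candidate maps predict double-digest fragments of 300, 300, and 400 bp versus 100, 300, and 600 bp, so a single gel distinguishes them. In the partial digest, each labeled fragment length is simply the distance of one site from the labeled end.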

Southern DNA Blots The ability to divide DNA into 
reproducible pieces allows the restriction sites around a 
particular sequence of interest to be mapped. This possi- 
bility is realized in the laboratory by determining which 
restriction fragments hybridize to a specific labeled probe 
sequence, a technique called the Southern blot (after its
originator, Edwin Southern). DNA restriction fragments
from a sample are separated by gel electrophoresis;
their distribution in the gel is preserved as they are dena- 
tured and transferred by blotting to a solid substrate with 
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▲ Figure 6-23 Mapping the multiple recognition sites of a 
restriction enzyme by partial digestion. DNA is labeled at its 
termini with ³²P, and fragments with one labeled terminus
can be obtained by cutting off one end with an appropriate
enzyme. The mapping procedure is applied to the remaining
piece with a second enzyme. Complete digestion would produce
only one labeled fragment (here, the 0.2-unit piece), but
brief, partial digestion (in which the enzyme cuts each long
piece only once, at most) produces a labeled fragment for
each restriction site. From the lengths of the labeled pieces,
the positions of enzyme II restriction sites can be inferred.
[See H. O. Smith and M. Birnstiel, 1976, Nuc. Acids Res.
3:2387.]



▲ Figure 6-22 Mapping the cleavage sites of two restriction
enzymes with respect to one another. (a) When a given
piece of DNA is exposed separately to two restriction enzymes
(I and II), each cuts the DNA once. The lengths of the
fragments are determined by gel electrophoresis. (b) Digestion
with both enzymes is used to determine the relative
positions of the cuts along the DNA. The fragment lengths
identify the positions of the restriction sites for enzymes I
and II with respect to the ends of the DNA and therefore
with respect to each other. By continuing this process with
different pairs of enzymes, the investigator can construct a
detailed map of restriction sites.



a charged surface (usually a nitrocellulose filter). The filter
is then exposed to a specific radioactive nucleic acid
sequence (the probe). The blotted DNA fragments that
are complementary to the probe hybridize with it, and
their location on the filter can be revealed by autoradiography
(Figure 6-24). This technique is so sensitive that a
DNA sequence that appears only once in the human genome
(about 1 part in 10^6) can be detected in as little as
5 µg of DNA (the DNA content of about 10^6 cells).
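
The sensitivity quoted above is easier to appreciate as an absolute amount. The short calculation below uses only the numbers given in the text (5 µg of DNA, a target fraction of about 1 part in 10^6, about 10^6 cells); the function names are illustrative.

```python
PG_PER_UG = 1_000_000  # picograms per microgram


def target_mass_pg(total_dna_ug, target_fraction):
    """Mass in picograms of the target sequence within a DNA sample."""
    return total_dna_ug * PG_PER_UG * target_fraction


def dna_per_cell_pg(total_dna_ug, n_cells):
    """Average DNA content per cell in picograms."""
    return total_dna_ug * PG_PER_UG / n_cells


if __name__ == "__main__":
    print(target_mass_pg(5, 1e-6))    # mass of single-copy target in 5 µg
    print(dna_per_cell_pg(5, 1e6))    # DNA per cell implied by the text
```

With these figures the blot is detecting on the order of 5 pg of target sequence, and the implied DNA content is about 5 pg per cell.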

This test is widely used in genetic studies of humans, 
who do not, as a rule, breed within families. Conse- 
quently, the human population shows many genetic dif- 
ferences, or genetic polymorphisms. These variations are 






▲ Figure 6-24 The Southern blot technique for detecting the
presence of specific DNA sequences. [Diagram labels: DNA;
cleave with restriction enzymes; gel electrophoresis; capillary
action transfers DNA from gel to nitrocellulose (filter paper,
gel, nitrocellulose); hybridize with labeled DNA or RNA probe;
autoradiography.]




▲ Figure 6-25 The Northern blot technique for detecting
the presence of specific mRNA molecules. Autoradiography
reveals the position of the complementary mRNA in the gel,
and the density of the spot shows the amount of it. The
photograph shows β-globin mRNA in erythroleukemia cells at
three different times: cells that are growing and have not
been induced to make globin (UN), and cells 48 h and 96 h after
they have been induced to stop growing and begin differentiating.
The β-globin mRNA is barely detectable in uninduced
cells but increases by a factor of more than 1000 during
differentiation. Photograph courtesy of L. Kole.
[Gel labels: lanes UN, 48 h, 96 h; size markers 5 kb and 2 kb;
band: β-globin mRNA.]



often indicated by the presence or absence of particular
restriction sites; such variations are called restriction
fragment length polymorphisms (RFLPs).

Northern (RNA) and Western (Protein) Blots The
Southern blot technique has been adapted, in a procedure
called the Northern blot, to detect the presence of specific
mRNA molecules. The RNA molecules in a sample are
denatured by treatment with an agent such as formaldehyde
to prevent hydrogen bonds from forming between base pairs.
The RNA sample (often the total RNA from cells)
is then separated according to size by gel electrophoresis
and transferred to a nitrocellulose filter to which the
denatured RNA will adhere. The filter is exposed to a
labeled DNA probe and subjected to autoradiography. The
Northern blot indicates the amount as well as the presence
and size of a specific mRNA in a sample, and the procedure
is widely used to compare the amounts of a specific mRNA in
cells under different conditions (Figure 6-25).

Another bit of laboratory jargon that has become a
standard term is the Western blot. In this procedure, a one-
or two-dimensional electrophoretic separation of proteins is
carried out; the protein is transferred, or blotted, to
nitrocellulose, which is then exposed to radioactive antibodies
against a particular protein; autoradiography reveals the
presence of that protein.

Band Analysis of S1 Digests An important method
for measuring the length of complementary sequences in
two nucleic acids employs the endonuclease S1, an enzyme
from the mold Aspergillus oryzae that destroys unpaired
RNA or DNA but not double-stranded molecules
(Figure 6-26). Either the RNA or the DNA in a hybrid
may be labeled. For example, the total unlabeled mRNA
from a cell can be exposed to a labeled DNA probe (usually
consisting of one or more restriction fragments) that
may include all or part of the region of DNA that is
transcribed to produce one particular mRNA. The labeled
RNA-DNA hybrid is then digested with S1 to remove
unpaired nucleic acid strands, leaving hybrid duplexes
intact. After electrophoresis, different hybrids form
discrete bands, whose positions can be used to estimate the
lengths of the hybrids. This technique is widely used to
determine how much of a particular DNA restriction
fragment is complementary to an mRNA region.

▲ Figure 6-26 The S1 mapping technique determines the
lengths of complementary sequences in two nucleic acid samples.
A portion of the map of adenovirus DNA is shown.
Earlier hybridization experiments established that an mRNA
was complementary to a sequence in the large region
spanned here by restriction fragment A. Restriction fragments
A, B, C, and J were prepared from ³²P-labeled DNA, denatured,
and hybridized with a large excess of mRNA prepared
from virus-infected cells. The mixture was treated with S1 to
destroy any unpaired DNA that had not found an mRNA
partner; the protected labeled DNA fragments (red) were
then separated by gel electrophoresis. The autoradiograph of
the gel shows the lengths of the segments protected by
(complementary to) the mRNA. Fragments A and B were
protected for 1.7 kb; C and J were protected for 1.0 and 0.7 kb.
Thus the mRNA includes 1.7 kb positioned as indicated.
From A. J. Berk and P. A. Sharp, 1978, Cell 14:695.
Copyright M.I.T.
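
The reasoning behind S1 band analysis amounts to computing how much of each restriction fragment overlaps the transcribed region. The sketch below models this as an interval overlap. The coordinates are hypothetical, chosen only so that the protected lengths reproduce the pattern reported for Figure 6-26 (1.7 kb for A and B, 1.0 kb for C, 0.7 kb for J); they are not the actual adenovirus map positions, and the model ignores introns.

```python
def protected_length(fragment, transcript):
    """Length of a DNA fragment (start, end) that is base-paired with a
    colinear transcript (start, end) and so survives S1 digestion."""
    frag_start, frag_end = fragment
    rna_start, rna_end = transcript
    return max(0, min(frag_end, rna_end) - max(frag_start, rna_start))


if __name__ == "__main__":
    transcript = (2000, 3700)  # hypothetical 1.7-kb transcribed region
    fragments = {              # hypothetical fragment coordinates (bp)
        "A": (0, 6000),
        "B": (500, 4500),
        "C": (1000, 3000),
        "J": (3000, 5000),
    }
    for name, coords in fragments.items():
        print(name, protected_length(coords, transcript))
```

Fragments that span the whole transcribed region are protected over its full length, whereas fragments that end inside it are protected only up to the restriction site. This is how the protected lengths position the mRNA relative to the restriction map.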

Finding the Start Site of an mRNA It is often very
important to find the point in a DNA sequence at which
transcription of a particular mRNA begins. Two methods
are used: S1 endonuclease mapping and the primer extension
technique. First, a general region of a DNA molecule that
includes the start site must have been located and sequenced
(DNA sequencing is described in the next section). Appropriate
restriction sites can be chosen in this sequence to prepare
a piece of end-labeled DNA approximately 100 nucleotides
long that will hybridize with the 5' portion of the
mRNA. Figure 6-27 shows how such a piece of labeled
DNA, trimmed with S1 endonuclease, can be used to locate
the exact start site. The same logic applies with
primer extension. A DNA oligonucleotide about 20 bases
long is chosen to find a specific complementary site on the
mRNA. This primer can be extended enzymatically to the
beginning of the mRNA; the length of this extension
product can then be determined accurately by gel
electrophoresis.

▲ Figure 6-27 The site in DNA that encodes the first nucleotide
in an mRNA molecule can be found by using primer
extension or S1 endonuclease (see Figure 6-26). (a) In the
primer extension technique, a short (approximately 10
nucleotides) deoxynucleotide (blue) is prepared and end-labeled.
After it is hybridized to the mRNA (red), it is
lengthened by the enzyme reverse transcriptase until it
reaches the first nucleotide of the mRNA. (b) The use of S1
to map start sites begins with the preparation of a uniquely
end-labeled short (approximately 100 nucleotides), single
strand that encodes the general region of the mRNA
start site and whose total sequence is known. This is hybridized
with the mRNA, and unpaired nucleic acid is then digested
with S1 endonuclease. Denaturation leaves a labeled
piece, whose length accurately marks the distance of the
starting nucleotide of the mRNA from the nucleotide that
hybridized with the labeled DNA end.

Endonucleases Compared to Exonucleases Thus
far, we have been discussing endonucleases, enzymes that
cut DNA or RNA (or both) within a chain. Restriction
endonucleases have a restricted cutting specificity; S1 and
the widely used pancreatic RNase and DNase digest almost
all internucleotide bonds equally. The S1 endonuclease
acts only on single strands; the pancreatic enzymes can
act on single- or double-stranded molecules.
Exonucleases remove nucleotides
one at a time from the ends of entire RNA or DNA
strands. Some act only on single strands; others remove
nucleotides from either the 5' or the 3' ends (but not
both) of duplex DNA. There are too many of these enzymes
to make a comprehensive description of them
here. We shall describe exonucleases as necessary in this
and other chapters.

The Sequence of Nucleotides in Long 
Stretches of DNA Can Be Rapidly 
Determined 

The discovery of restriction endonucleases was an impor- 
tant step that led to general methods for determining the 
exact nucleotide sequences in long stretches of DNA. An 

▲ Figure 6-28 DNA sequencing by the Maxam-Gilbert
method. A 5'-end-labeled DNA fragment is prepared for
sequencing. Four identical samples of this fragment are
subjected to four different chemical reactions. Each breaks the
fragment only (or mainly) at the A, G, C + T, or C residues,
respectively. The reactions are controlled, so that each
labeled chain is likely to be broken only once. The resulting
labeled subfragments created by all four reactions have the
label at one end and the chemical cleavage point at the
other. Gel electrophoresis and autoradiography of each separate
mixture yield one radioactive band for each nucleotide
in the original fragment. Bands appearing in the A and G
lanes can be read directly. Bands in the C + T and C lanes
are read as Cs; those in the C + T lane alone, as Ts. [See
A. Maxam and W. Gilbert, 1977, Proc. Nat'l. Acad. Sci. USA.]
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▲ Figure 6-29 DNA sequencing by the Sanger (dideoxy) method, (a) A single strand of DNA to be 
sequenced is hybridized to a 5'-end-labeled deoxynucleotide primer; four separate reaction mixtures
are prepared in which the primer is elongated by a DNA polymerase. Each mixture contains the four 
normal deoxynucleoside triphosphates plus one of the four dideoxy nucleoside triphosphates in a ratio 
such that about 1 in every 200 residues is a dideoxynucleotide. Since a dideoxynucleotide has no 3' 
hydroxyl, no further chain elongation is possible when such a residue is added to the chain. Thus, 
each reaction mixture will produce prematurely terminated chains ending at every occurrence of the 
dideoxynucleotide. Each mixture is then separated on a sequencing gel as in Figure 6-27. (b) An ac- 
tual radioautogram in which over 300 bases can be read. (The enzyme used in the figure is Se- 
quenase™, a commercial preparation of a bacteriophage-encoded polymerase.) Each reaction was car- 
ried out in duplicate, which aids in checking the sequence. Courtesy of United States Biochemical 
Corporation. 
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earlier and very important advance was the recognition 
that careful gel electrophoretic procedures can separate 
every single DNA fragment in a series up to 500 bases 
long. Two highly successful procedures for DNA se- 
quencing are in wide use. Both depend on the separation 
of a set of fragments that differ in length by only one 
nucleotide (Figure 6-28). The first, invented by Allan 
Maxam and Walter Gilbert, chemically cleaves an end- 
labeled DNA sample; the second, developed by Fred San- 
ger and his colleagues, uses enzymatic synthesis to extend 
a short sequence of end-labeled DNA (Figure 6-29). 
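
Both procedures reduce sequencing to reading a ladder of fragment lengths, and that reading step can be sketched in code. The example below is an idealized model: it assumes that every position yields a detectable band and that bands differing by one nucleotide are fully resolved. The function names and the toy sequence are illustrative. `sanger_lanes` lists, for each dideoxy reaction, the lengths of the terminated chains (counted from the end of the primer); `read_gel` reads the sequence from shortest to longest band; `read_maxam_gilbert` applies the lane rules given in the caption of Figure 6-28.

```python
BASES = "ACGT"


def sanger_lanes(synthesized):
    """For a newly synthesized strand (5'->3', primer excluded), return
    the chain lengths expected in each of the four dideoxy lanes."""
    lanes = {base: [] for base in BASES}
    for length, base in enumerate(synthesized.upper(), start=1):
        lanes[base].append(length)  # chains ending in this dideoxy base
    return lanes


def read_gel(lanes):
    """Read a sequence from the lanes, shortest fragment first."""
    base_at = {n: base for base, lengths in lanes.items() for n in lengths}
    return "".join(base_at[n] for n in sorted(base_at))


def read_maxam_gilbert(a, g, c_plus_t, c):
    """Read band positions from the A, G, C + T, and C lanes: a band in
    both C + T and C is C; a band in C + T alone is T."""
    calls = {n: "A" for n in a}
    calls.update({n: "G" for n in g})
    calls.update({n: ("C" if n in c else "T") for n in c_plus_t})
    return "".join(calls[n] for n in sorted(calls))


if __name__ == "__main__":
    lanes = sanger_lanes("GATTACA")
    print(lanes)
    print(read_gel(lanes))
    print(read_maxam_gilbert({2, 5, 7}, {1}, {3, 4, 6}, {6}))
```

In both cases the sequence is recovered simply by sorting bands by length and noting the lane in which each appears, which is why the ability to resolve fragments differing by a single nucleotide is the critical requirement.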

Modern DNA sequencing is fairly simple and accurate 
over long regions; already, the total genomes of many 
viruses and almost all of the E. coli genome have been 
sequenced. Automation of the techniques for sequencing 
large pieces of DNA (see Figure 6-8) should permit sequencing
of the entire human genome in 10-15 years, if
not before. 

Proteins Can Be Sequenced Automatically 

From a DNA sequence and the genetic code, it is possible
to deduce the sequence of the encoded protein. And with
the aid of computers to locate "open reading frames,"
i.e., codon stretches without protein termination signals,
investigators often do just that. Nevertheless, the ability 
to sequence protein chains directly remains a crucially 
important and necessary tool of molecular biology. To 
cite one application, the genome of a higher animal may 
contain a number of genes that are similar but not identi- 
cal in sequence; only by knowing the protein sequence of 
a product of such related regions can the observer know 
which DNA sequence is responsible for encoding a partic- 
ular protein. Even more importantly, perhaps, proteins of 
interest are most often isolated before their genes, so ob- 
taining at least a partial amino acid sequence is a critical 
first step in studying many proteins. The most popular 
direct protein-sequencing technique in use today is the

▲ Figure 6-30 Sequencing a protein by the Edman degradation
procedure. The peptide is treated with phenylisothiocyanate,
which combines with the amino-terminal residue
in the peptide chain, rendering the first peptide bond in the
chain labile to treatment with mild acid. The same pair of
reactions is carried out repeatedly to remove the amino acids
one at a time. After each step, the removed amino acid is
chemically identified. In this way, the entire amino acid
sequence of a short peptide can be determined.



Table 6-5 Terms used in recombinant DNA research

Genomic DNA               All DNA sequences of an organism

cDNA (complementary DNA)  DNA copied from an mRNA molecule

Plasmid                   A small, circular, extrachromosomal DNA
                          molecule capable of reproducing
                          independently in a host cell

Vector                    A plasmid or a viral DNA molecule into
                          which either a cDNA sequence or a genomic
                          DNA sequence is inserted

Host cell                 A cell (usually a bacterium) in which a
                          vector can be propagated

Genomic clone             A selected host cell with a vector
                          containing a fragment of genomic DNA from
                          a different organism

cDNA clone                A selected host cell with a vector
                          containing a cDNA molecule from another
                          organism

Library                   A complete set of genomic clones from an
                          organism or of cDNA clones from one cell
                          type



Edman degradation procedure, in which amino acid resi- 
dues are cleaved from a protein one by one; after each 
cleavage, the released amino acid is identified (Figure 6- 
30). Machines called sequenators can perform this reac- 
tion on tiny amounts of a pure protein; obtaining an ac- 
curate sequence of 50 amino acids is not exceptional. 
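
The computer search for open reading frames mentioned at the start of this section can be illustrated with a minimal sketch. It follows the definition given in the text (a stretch of codons without a termination signal) and makes several simplifying assumptions that are not from the text: only the given strand is scanned, the standard stop codons TAA, TAG, and TGA are used, and no start codon is required. The function name and the `min_codons` threshold are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def open_reading_frames(dna, min_codons=30):
    """Return (frame, start, end) for every run of at least `min_codons`
    consecutive non-stop codons in the three frames of the given strand.
    `start` is inclusive and `end` exclusive (0-based indices)."""
    dna = dna.upper()
    found = []
    for frame in range(3):
        run_start = frame
        pos = frame
        while pos + 3 <= len(dna):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - run_start) // 3 >= min_codons:
                    found.append((frame, run_start, pos))
                run_start = pos + 3  # next run begins after the stop
            pos += 3
        if (pos - run_start) // 3 >= min_codons:  # run reaching the end
            found.append((frame, run_start, pos))
    return found


if __name__ == "__main__":
    toy = "ATGAAACCCGGGTAAATG"  # made-up sequence for illustration
    for orf in open_reading_frames(toy, min_codons=4):
        print(orf)
```

On real sequences a much larger `min_codons` would be used, since short stop-free stretches occur by chance in every frame. A complete search would also scan the complementary strand.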



Recombinant DNA: Selection 
and Production of Specific 
DNA 

The essence of cell chemistry is to purify sufficient quantities
of a particular substance to permit its chemical behavior
to be analyzed. Segments of pure samples of identical,
relatively short DNA molecules from viruses or
plasmids can be isolated directly and subdivided into
smaller pieces with the use of restriction endonucleases.
But the human genome, for example, contains about 3 ×
10^9 bp, so that cutting roughly at every 3000th base pair
would produce a million fragments that could not be separated
from each other. This obstacle to obtaining pure
DNA samples from large genomes has been overcome by
recombinant DNA technology.

Two widely used types of recombinant DNA preparations,
genomic clones and cDNA clones, are made. A
genomic clone contains a fragment of genomic DNA; a
cDNA clone contains a molecule of complementary DNA
copied from mRNA by enzymes (Table 6-5). In both, the
DNA of interest is linked to a vector, most often a
bacteriophage or a plasmid that can reproduce independently
within a bacterial host. (The most widely used host-vector
systems are E. coli as host with either a plasmid or
bacteriophage λ as the vector.) Recently yeast artificial
chromosomes (YACs) have been prepared that can be
used as vectors in yeast cells for very large genomic
fragments. A library consisting of a full set of genomic or
cDNA clones can be prepared from the total DNA of an
organism or cell type or from the set of cDNA molecules
copied from all mRNAs in a cell (Figure 6-31).

The preparation and selection of cDNA and genomic 
clones are illustrated in the following section by a descrip- 
tion of how recombinant DNA containing mouse globin 
sequences can be obtained.

▲ Figure 6-31 A comparison of genomic cloning (a) with
cDNA cloning (b). In genomic cloning, the genomic DNA
must be cleaved with restriction endonucleases before it can
be inserted into vectors; in cDNA cloning, the mRNAs must
first be copied into double-stranded DNA molecules.



cDNA Clones Are Whole or Partial 
Copies of mRNA 

To prepare cDNA clones with globin-encoding sequences 
(Figure 6-32), the starting material is reticulocytes, eryth- 
rocyte (red blood cell) precursors. Over 90 percent of the 
proteins synthesized by these cells are α- and β-globins
and therefore they are rich sources of globin mRNA. 

The enzyme reverse transcriptase (found in retrovi- 
ruses; see Figure 5-39) is used to make cDNA clones of 
the reticulocyte mRNAs. Like the DNA polymerases in 
cells, this enzyme can build a complementary nucleic acid 
strand on a template, but only by adding nucleotides to a 
primer. Thus, before the reverse transcriptase can do its 
work, a short primer strand must be hybridized to the 
nucleotides near the 3' ends of the mRNAs. Fortunately,
a single oligonucleotide primer, a string of thymidylate
residues (poly T), serves for most eukaryotic mRNAs,
which end in a string of 50-250 adenylate residues
(poly A).

After the cDNA copy of the mRNA has been made, the
mRNA is removed by an alkali treatment that destroys
RNA but does not affect DNA, and a duplex DNA is
made from the cDNA strand. In one technique, the 3' end
of each cDNA strand is elongated by adding several resi- 
dues of a single nucleotide (say, poly C) through the 
action of a terminal transferase, an enzyme that adds 
bases at free 3' ends. A poly G primer is hybridized with 
the terminal poly C and this G primer is then elongated 
by a DNA polymerase. What results is a complete double- 
stranded DNA copy of the original mRNA. 
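
The base-pairing relationships in these steps can be checked with a short sketch. It is a simplification that ignores the homopolymer tails and primers added by terminal transferase: the first strand is treated as the exact complement of the mRNA, and the second strand as the exact complement of the first. The function names and the toy mRNA are illustrative.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """cDNA strand (written 5'->3') complementary to an mRNA (5'->3'),
    as made by reverse transcriptase from a poly T primer."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna.upper()))


def second_strand(cdna):
    """DNA strand (written 5'->3') complementary to the first strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna.upper()))


if __name__ == "__main__":
    mrna = "AUGGCCAAAA"  # toy mRNA ending in a short poly A tail
    first = first_strand_cdna(mrna)
    print(first)                 # begins with T residues from the primer
    print(second_strand(first))  # same as the mRNA, with T in place of U
```

The first strand begins with the thymidylate residues that paired with the poly A tail, and the second strand reproduces the mRNA sequence in DNA form. For this reason a cDNA clone can be read directly as the message.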

The next step is to insert the now double-stranded 
DNA into a plasmid. Plasmids, which occur naturally in 
almost all bacteria, were originally detected by their abil- 
ity to transfer genes between bacteria (Chapter 5). It has 
been shown that a specific region of the plasmid circle, 
the replication origin, must be present to assure replication
of the plasmid in a host bacterium.

The plasmid DNA is cleaved once with a restriction 
enzyme at a point that leaves the replication origin undis- 
turbed. The double-stranded copy of the mRNA-globin is 
then inserted at the cut site and the circle is rejoined. The
first technique, still widely used, for carrying out this
insertion is called homopolymeric tailing. A homopolymer
(say, poly C) is added to the two 3' ends of the double-stranded
cDNA-globin, and a complementary homopolymer
(poly G) is added to the 3' ends of the cut plasmid.
When the "tailed" plasmid and cDNA-globin are mixed,
their complementary single-stranded tails spontaneously
hybridize; the resulting circular recombinant molecule
can be resealed with the enzyme DNA ligase (Chapters 3
and 12). Specially treated E. coli cells take up the plasmid,
and the recombinant molecule multiplies along with
the cells.

▲ Figure 6-32 Preparation of a cDNA clone with globin encoding
If the chosen plasmid contains a gene for resistance to
an antibiotic, the cells that take up the plasmid will grow
and multiply in the presence of the antibiotic but the
other cells will not. If, at the outset, the number of plasmids
allowed to transform the E. coli cells is one-tenth or less
of the total number of E. coli cells, it is very unlikely that
more than one plasmid will end up in a recipient cell. As a
rule, then, the recombinant DNA in all cells of a colony
grown from a single cell will have descended from a single
recombinant DNA molecule. In the case described here,
90 percent of the recombinant plasmids would encode α-
or β-globin. Because mRNA molecules are often not completely
copied, partial sequences also may be cloned. To
verify exactly what the plasmid vector contains, the
recombinant molecule can be sequenced.
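
The claim that a one-to-ten ratio of plasmids to cells makes double uptake unlikely can be checked with a simple probability estimate. The sketch below assumes that plasmids enter cells independently and at random, so that the number per cell follows a Poisson distribution whose mean equals the plasmid-to-cell ratio. This is a modeling assumption introduced here, not a statement from the text, and the function name is illustrative.

```python
import math


def fraction_with_multiple(ratio):
    """Among cells that took up at least one plasmid, the fraction that
    took up two or more, assuming Poisson uptake with mean `ratio`."""
    p_none = math.exp(-ratio)
    p_one = ratio * p_none
    return (1 - p_none - p_one) / (1 - p_none)


if __name__ == "__main__":
    for ratio in (1.0, 0.1, 0.01):
        share = fraction_with_multiple(ratio)
        print(f"ratio {ratio}: {share:.3f} of plasmid-bearing cells have 2+")
```

Under this assumption, at a ratio of 0.1 only about 5 percent of the plasmid-bearing cells would carry more than one plasmid, and the fraction falls further as the ratio is lowered.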

Complementary DNA clones can be prepared from the 
unpurified mRNA from any cell type, but this produces a 
random mixture of individual recombinant clones that 
must be screened to isolate specific ones (see Figure 6-34). 
It is also possible to use an antibody that reacts with a 
protein to detect whether an E. coli colony (or a plaque if 
the vector is a bacteriophage) contains the protein encoded
by the cloned cDNA. 

Genomic Clones Are Copies of DNA from 
Chromosomes 

The most common procedure for preparing and selecting
specific clones from genomic DNA (for example, the
total collection of DNA in mouse chromosomes) makes
use of λ bacteriophage. The DNA of the phage is about
50 kb long, but a center section about 25 kb long can be 
removed and replaced with foreign DNA without impair- 
ing the ability of the phage to infect and reproduce in 
most E. coli cells. A genomic library is a collection of 
recombinant molecules, maintained either in phage parti- 
cles or in plasmids growing in bacteria, that includes all 
DNA sequences of a given species. Once it is prepared, 
the library can be screened for the phage or plasmid that 
contains the DNA sequence of interest. 

The size of a library depends on the amount of DNA in
the organism's haploid genome. For example, the human
and mouse genomes are between 3 × 10^9 and 4 × 10^9 bp long.
If one of these genomes were divided into fragments
about 20 kb long for insertion into bacteriophage λ, then
2 × 10^5 different recombinant bacteriophage λ particles
would be required to constitute a complete library. Because
the pieces of DNA are incorporated into phages by
chance, about 10^6 recombinant phages are necessary to
assure that each DNA piece has a 90-95 percent chance
of being included.
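
The relation between library size and the chance of including a given sequence can be estimated with a standard sampling argument. The sketch below assumes that every fragment is equally likely to be cloned and that clones are independent, an idealization that real libraries do not fully meet. If each clone carries a fraction f of the genome, the probability that a given sequence is absent from N clones is (1 - f)^N. The function names are illustrative.

```python
import math


def clones_needed(genome_bp, insert_bp, probability):
    """Number of independent clones needed so that a given sequence is
    present with the stated probability: N = ln(1 - P) / ln(1 - f)."""
    f = insert_bp / genome_bp  # fraction of the genome per clone
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


def coverage_probability(genome_bp, insert_bp, n_clones):
    """Probability that a given sequence is present among n_clones."""
    f = insert_bp / genome_bp
    return 1 - (1 - f) ** n_clones


if __name__ == "__main__":
    genome, insert = 4e9, 2e4  # upper genome size and insert size from the text
    print(genome / insert)                       # clones in one exact tiling
    print(clones_needed(genome, insert, 0.95))   # clones for a 95% chance
    print(coverage_probability(genome, insert, 1e6))
```

For a 4 × 10^9 bp genome and 20-kb inserts, one exact tiling corresponds to the 2 × 10^5 particles mentioned above, and this idealized model calls for roughly 6 × 10^5 clones for a 95 percent chance of including a given sequence. The model is somewhat more optimistic than the figure quoted in the text, which presumably allows for fragments that are cloned or propagated less efficiently than others.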

The first step in preparing a genomic library is to extract
all the organism's DNA from some cell type (Figure 6-33).
Sperm cells or early embryos are often used.
The DNA is then broken into fragments by a restriction
endonuclease, such as EcoRI, which cleaves the DNA in a
way that produces short, single-stranded, "sticky" ends
(AATT) on every fragment (see Figure 6-20). Digestion is
stopped when the average size of a fragment is approximately
20 kb. The bacteriophage λ DNA also can be cut
at two restriction sites by EcoRI to yield a center section
approximately 25 kb long plus two shorter flanking ends,
or arms. The center section of the phage DNA can then be
separated from the two arms.

▲ Figure 6-33 The construction of a genomic library of
mouse DNA in bacteriophage λ. The total DNA from mouse
cells (both sperm cells and embryonic tissue cells presumably
have a complete set of sequences) is often used. A single
region of the mouse genome, such as the one that encodes
β-globin, would occur approximately once in 10^5 particles.

► Figure 6-34 Selection of a specific genomic clone from a
bacteriophage λ library. Although about 2 × 10^5 phages
could contain all mouse sequences, 2 × 10^6 phages are plated
to ensure that a phage with the desired sequence is included.
This requires an area of 1000-2000 cm² to accommodate all
the phage plaques. (In the initial plating, the plaques are not
allowed to develop to a visible size. The plating can be
repeated with fewer phages to obtain pure isolates.) The
position of the spot on the autoradiograph identifies the desired
plaque on the plate. Phage particles from that plaque can
then be selected.

The λ arms and the collection of genomic DNA fragments are
mixed in about equal amounts (approximately 10⁶ DNA fragments
and a similar number of pairs of λ arms). Because the sticky
ends are complementary, molecules approximately the same
length as normal phage DNA will form, but they will include a
piece of mouse DNA about 20 kb in length. DNA ligase, the
enzyme that normally joins DNA breaks, is used to seal the
recombinant DNA molecules, which are then coated with
bacteriophage proteins prepared from infected E. coli cells.
Only DNA molecules of the correct size will be effectively
coated, or packaged, and give rise to fully infectious λ
bacteriophages, which can be grown on a lawn of E. coli.
Bacteriophages containing DNA sequences that code for any
specific protein (for example, globin) can be detected by
hybridization of globin cDNA sequences (prepared as described
in Figure 6-32) with DNA obtained from each plaque (Figure
6-34).
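
The reason for plating roughly ten times more phages than the minimum that could span the genome can be sketched with the standard random-sampling estimate N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome carried by one clone and P is the desired probability that a given sequence is present. The mouse genome size of 3 × 10⁹ bp used below is an assumption (the text does not state it), and the function names are illustrative.

```python
import math


def clones_needed(genome_bp, insert_bp, probability):
    """Independent clones required so that a given sequence is present
    with the requested probability, assuming random fragments."""
    fraction = insert_bp / genome_bp            # share of the genome per clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


def probability_present(genome_bp, insert_bp, n_clones):
    """Probability that a given sequence is in at least one of n_clones."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


GENOME_BP = 3e9    # assumed haploid mouse genome size, not taken from the text
INSERT_BP = 20e3   # average insert size given in the text

print(GENOME_BP / INSERT_BP)                           # minimum number of clones
print(clones_needed(GENOME_BP, INSERT_BP, 0.99))       # clones for 99% certainty
print(probability_present(GENOME_BP, INSERT_BP, 2e6))  # coverage with 2 x 10^6 phages
```

Under these assumptions the minimum is on the order of 10⁵ clones, several times that many are needed for 99 percent confidence, and plating 2 × 10⁶ phages makes the absence of any particular sequence very unlikely.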
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Vectors for Recombining DNA Exist in 
Many Cell Types 

Any gene that can be subjected to a hybridization assay can be
purified. Once a bacteriophage or other bacterial vector
containing the desired gene is prepared, an unlimited amount
of the pure gene can be obtained by growing the vector and
extracting the DNA. Vectors can also carry recombinant DNA
molecules in yeast, higher plant cells, and human cells. The
vectors most frequently used in mammalian cells are the small
DNA viruses SV40 and polyoma or the slightly larger papilloma
viruses, which can grow as plasmids. Retroviruses are
reminiscent of transducing bacteriophages in that, after they
enter a cell, their RNA is copied into DNA, which is then
inserted into the host chromosomes. Defective retrovirus
vectors, which can direct DNA copying and insertion but cannot
reproduce themselves, promise to be a successful means of gene
therapy for individuals with single genetic defects that may
be treated in somatic (body) tissue. In plant cells, the most
common vector is the Ti plasmid, whose host is Agrobacterium
tumefaciens; this bacterium attaches to the plant cell and
transfers the recombinant DNA to it.

For the cell biologist, the availability of unlimited amounts
of a pure gene through recombinant DNA technology offers rich
opportunities for chemical and biological study. Access to
vectors in yeast and in cultured mammalian cells affords the
additional possibility of testing the biological functions of
particular eukaryotic DNA sequences in a variety of eukaryotic
cells.

Industrial microbiologists employ recombinant DNA techniques
to engineer bacteria and other easily cultured organisms to
make proteins for use in medicine, agriculture, and research.
A number of viral proteins important in immunizations (for
example, against the foot-and-mouth disease virus in cattle or
the hepatitis virus in humans) have already been synthesized
in E. coli, as have several hormones and enzymes (among them,
insulin, growth hormone, and tissue plasminogen activator
[TPA], which dissolves blood clots and is used to treat heart
attacks). The vectors that direct such programmed protein
synthesis, called expression vectors, allow the experimenter
to take advantage of bacterial genetic tricks that increase
mRNA synthesis to produce large quantities of a desired
protein.

The Polymerase Chain Reaction Amplifies 
Specific DNA Sequences in a Mixture 

A new procedure called the polymerase chain reaction (PCR) can
selectively and repeatedly replicate chosen segments from a
complex DNA mixture (Figure 6-35). This way of amplifying rare
sequences from a mixture has vastly increased the sensitivity
of genetic tests.

In a typical application of PCR, DNA from a small sample of
blood is cut into segments with a restriction endonuclease and
denatured into single strands. Oligonucleotide probes
complementary to the 3' ends of the DNA segment to be
amplified are prepared. The probes are added in great excess
to the denatured DNA at a temperature between 50° and 60°C.
The total genomic DNA sample, which is at a low concentration,
remains denatured, but the specific oligonucleotide probes
hybridize with their correct sites on the DNA. The hybridized
probes then serve as primers for DNA chain synthesis, which
begins upon addition of a supply of deoxynucleotides and a
temperature-resistant DNA polymerase obtained from Thermus
aquaticus (a bacterium that lives in hot springs). This enzyme
(called the Taq polymerase) can extend the primers at high
temperatures (up to 72°C). When synthesis is complete, the
whole mixture is heated further (to 95°C) to melt the newly
formed DNA duplexes. When the temperature is lowered again,
another round of synthesis can take place because excess
primer is still present. This cycle of synthesizing and
remelting can be repeated to amplify the sequence of interest.
At each round, the number of copies of the sequence between
the primer sites is doubled, and therefore the desired
sequence increases exponentially.

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Figure 6-35 The polymerase chain reaction (PCR). Taq
polymerase, a heat-resistant DNA polymerase from Thermus
aquaticus, is used to extend primers between two fixed points
on a DNA molecule. All the components for chain elongation
(primers, deoxynucleotides, and polymerase) are heat-stable.
Thus multiple heating and cooling cycles result in alternating
DNA melting and synthesis. DNA between the recognition sites
of the two oligonucleotide primers accumulates exponentially.
Overnight, it may be amplified as much as a millionfold.
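
The doubling arithmetic behind this exponential increase can be sketched as follows. The model assumes that every cycle multiplies the target by (1 + efficiency), with efficiency = 1.0 corresponding to the perfect doubling described in the text. The efficiency parameter and the function names are illustrative assumptions, not part of the original procedure.

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Fold increase in target copies after the given number of cycles."""
    return (1.0 + efficiency) ** cycles


def cycles_for(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches the requested fold increase."""
    exact = math.log(fold) / math.log(1.0 + efficiency)
    return math.ceil(exact - 1e-9)   # small tolerance for floating-point rounding


print(fold_amplification(20))   # 1048576.0, about a millionfold
print(cycles_for(1e6))          # 20
```

With perfect doubling, about 20 cycles give a millionfold amplification; a lower per-cycle efficiency requires correspondingly more cycles.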

The polymerase chain reaction allows specific DNA 
regions from a tiny sample to be examined quickly. PCR 
is already in use as a diagnostic procedure in human ge- 
netics. In basic research, PCR allows recovery of entire 
sequences between any two ends whose sequences are 
known. 

Controlled Deletions and Base- 
Specific Mutagenesis of DNA 

The availability of pure DNA in unlimited amounts has
permitted the development of a variety of chemical and
enzymatic techniques for altering DNA. The practice of
genetics no longer depends on isolating naturally occurring
mutant organisms; DNA can be changed in the test tube and
reinserted into cells. Thus deletions and mutations can be
introduced into genes. Determining the effects of such changes
on protein structure and changing DNA sequences that may
function as genetic regulatory or control elements are two of
the most important uses of these techniques.

Two techniques for introducing mutations, the deletion of a
short DNA sequence and the alteration of a single base, are
illustrated in Figure 6-36. The function of the mutant DNA,
whether it is a deletion mutant (Figure 6-36a) or a point
mutant (Figure 6-36b), can be tested by reintroducing it into
a cell by injection or transformation (Chapter 5). The power
of this approach is that without knowing the role of a
particular sequence beforehand, the experimenter can determine
its function by altering its structure and reintroducing it
into the organism. Charles Weissmann has termed these
practices "reverse genetics."
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Synthetic Peptide and 
Nucleotide Sequences: Their 
Use in Isolating and 
Identifying Genes 

As more and more primary sequences of proteins and nucleic
acids become known, the special importance of certain short
sequences (regulatory signals in nucleic acids and functional
subsections, or "domains," in proteins) becomes more apparent.
These sequences can be chemically synthesized. With such
fragments, the function of a part of a protein, rather than
the whole protein, can be tested; likewise, altered
oligonucleotides can be inserted into normal cloned DNA
sequences to study the effects of specific mutations (see
Figure 6-36).

Another extremely valuable aspect of synthetic oligo- 
nucleotides and peptides is that they make it possible to 
isolate whole genes and pure proteins, respectively. Be- 
cause the genetic code is universal, a nucleic acid se- 
quence can be used to predict the exact protein sequence 
it encodes; with less certainty (due to degeneracy in the 
code), a peptide sequence can be used to predict the ap- 
proximate nucleic acid sequence that encodes it. Thus it 
has become feasible to go back and forth between the 
chemical languages of nucleic acids and proteins to ob- 
tain additional information about a polymer of one type 
or the other (Figure 6-37). 
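
The asymmetry between the two directions can be made concrete with a small sketch based on the standard genetic code. Translating a nucleotide sequence gives exactly one protein sequence, whereas a single amino acid may correspond to several codons. The compact table construction and the function names are illustrative choices, and the example sequence is made up.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3          # ignore an incomplete final codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(translate("ATGTGGTTTTAA"))   # MWF*
print(codons_for("W"))             # ['TGG']
print(codons_for("L"))             # six codons
```

Going from nucleic acid to protein is a simple lookup, while going from protein back to nucleic acid yields a set of candidate sequences.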

For example, if an mRNA region for a protein that is not yet
isolated is cloned and sequenced, a synthetic peptide that is
part of the protein can be prepared and used to provoke an
antibody that will react with a protein containing that
peptide. With such an antibody, the previously unisolated
protein corresponding to the already isolated RNA can be
identified in cells and purified. A reciprocal selection is
also possible: if a protein has been purified and a short
region of peptide sequence is available, then oligonucleotides
coding for that amino acid sequence can be synthesized and
used to screen a genomic or cDNA library for the particular
DNA sequence.

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Figure 6-36 In vitro mutagenesis: constructing DNA
deletions and point mutations through the use of recombinant
DNA techniques. (a) Deletions are made in cloned DNA in a
plasmid by removing entire sections of DNA between two
restriction sites or by cutting at a single restriction site
and using the exonuclease Bal 31, which removes nucleotides
from both ends of a cut double-stranded DNA molecule.
Deletions of various lengths are chosen from a collection of
such truncated molecules. (b) The two strands of a cloned DNA
are separated, and a chemically synthesized oligonucleotide
primer (see Figure 6-39) that is mismatched at a desired site
is hybridized to one of the DNA strands and then extended by a
DNA polymerase. Each strand of the new double-stranded
molecule is copied during replication to produce a mixed
population of the original DNA and mutants, which are then
separately cloned.

The degeneracy of the genetic code is an important
consideration in choosing peptides from which to reconstruct
partial mRNA sequences. For example, peptides containing
arginine, leucine, or serine (six codons each) are to be
avoided if possible. The best amino acids for making such
probes are tryptophan and methionine (one codon apiece) and
phenylalanine, tyrosine, histidine, aspartic acid, glutamic
acid, asparagine, and glutamine (two codons each). The number
of oligonucleotides that have to be synthesized to be certain
of a perfect match with the native mRNA is multiplicative; for
example, if a probe is to represent six amino acids with a
total of 2, 3, 2, 1, 2, 2 codons, then 48 separate sequences
are required.
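
The multiplicative rule can be checked with a short sketch. The codon counts below are those of the standard genetic code; the two peptides are hypothetical examples chosen so that the first one has six residues whose codon counts multiply to 48 (for instance 2, 3, 2, 1, 2, 2), matching the total quoted in the text.

```python
from math import prod

# Number of codons per amino acid (one-letter codes) in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "W": 1, "M": 1,
    "F": 2, "Y": 2, "H": 2, "D": 2, "E": 2, "N": 2, "Q": 2, "C": 2, "K": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "R": 6, "L": 6, "S": 6,
}


def probe_count(peptide):
    """Number of distinct oligonucleotides needed to cover every possible
    coding sequence for the peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


print(probe_count("FIYMHD"))   # 2 * 3 * 2 * 1 * 2 * 2 = 48
print(probe_count("RLSRLS"))   # 6 ** 6 = 46656
```

The second example shows why peptides rich in arginine, leucine, or serine make poor starting points for probe design.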

Techniques for the chemical synthesis of peptides (Figure
6-38) have been available for some time; techniques for DNA
oligonucleotide synthesis (Figure 6-39) are also in wide use.
The basic logic of these techniques is similar, although the
chemistry is different. Note that during chemical synthesis,
peptide chains grow from the carboxyl terminus to the amino
terminus and DNA chains grow from the 3' to the 5' end. Both
directions are opposite to the directions of biosynthetic
reactions in cells or cell extracts.
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Isolate protein (an enzyme or 
other biologically active 
protein — e.g., a hormone) 

Obtain partial amino acid 
sequence 

Make oligonucleotides that
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sequence
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Figure 6-37 It is now possible to identify an mRNA of
interest (say, an mRNA present in only one part of the brain)
and to use it to isolate the protein it encodes without
knowing the function of that protein. On the other hand, it is
possible to sequence part of a protein that has a specific
function (say, an enzyme or a growth factor) and then to
synthesize an oligonucleotide that can be used to identify and
isolate the gene that encodes the complete protein.
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Figure 6-38 Solid-phase peptide synthesis. The first
amino acid of the desired peptide is attached at its carboxyl
end by esterification to a resin. The amino group of the first
amino acid in the peptide under construction is blocked by the
attachment of a tert-butyloxycarbonyl group (yellow), which is
removed by treatment with trifluoroacetic acid (CF₃COOH). The
resulting free amino group forms a peptide bond with a second
amino acid, which is presented with a reactive carboxyl group
and a blocked amino group, together with the coupling agent
dicyclohexylcarbodiimide (DCC). The process is repeated until
the desired product is obtained; the peptide is then
chemically cleaved from the resin with hydrofluoric acid (HF).
[See R. B. Merrifield, L. D. Vizioli, and
H. G. Boman, 1982, Biochemistry 21:5020.]
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Figure 6-39 Synthesis of oligonucleotides. The first
nucleotide (monomer 1) is bound to a glass support by its 3'
hydroxyl; its 5' hydroxyl remains available. The synthesis of
the first internucleotide link is carried out by mixing
monomer 1 with monomer 2, a nucleotide derivative that
contains a reactive 3'-diisopropyl phosphoramidite [(IP)₂]
with an attached methyl group (Me) and that has the blocking
group 4,4'-dimethoxytrityl (DMT) bound to its 5' hydroxyl. In
the presence of a weak acid, the two nucleotides couple to
form a phosphite triester with phosphorus in a trivalent
state. Oxidation by iodine (I₂) yields a phosphotriester in
which the P is pentavalent; detritylation with zinc bromide
(ZnBr₂) is carried out, and the process is repeated. The
methyl groups on the phosphates are all removed at alkaline pH
when synthesis is finished. [See S. L. Beaucage and M. H.
Caruthers, 1981, Tetrahedron Letters 22:1859.]
[Diagram labels: Base 1, Base 2, Repeat process, Oligonucleotide]



Summary 

An indispensable adjunct of modern molecular cell biology is
the use of isotopes to label biologically important molecules.
The isotopes may be radioactive (most commonly used are ³H,
¹⁴C, and ³²P) or stable heavy isotopes used as density labels
(for example, ¹⁵N or ¹³C). These tracers are widely used in
cell-free biochemical experiments and in the observation of
metabolic events within cells. Important considerations in the
use of isotopes include the energy of the particle emitted
during radioactive decay, the speed at which various labeled
macromolecular precursors enter the cell, and the extent of
exchange between compounds in the cell and the medium. For
example, tritiated (³H) compounds give the best
autoradiographic images because the emitted β particle has a
low energy and the image on the photographic emulsion is
therefore better defined.

Pulse-chase experiments using labeled amino acids or thymidine
to study the synthesis of proteins or DNA can produce clear
results because amino acids are exchanged between the cell and
the medium within a minute or two, and thymidine enters a very
small pool that is quickly consumed by cell growth. However,
pulse-chase experiments with labeled RNA precursors are much
less effective because ribonucleosides enter a large
intracellular pool that is slowly consumed.

Techniques for separating and purifying molecules from cells
have reached the level of a high art. In addition to the many
varieties of chromatographic procedures, two basic methods,
centrifugation and electrophoresis, are frequently applied to
problems in molecular cell biology. Both techniques are most
useful in separating molecules according to chain length.
Separations of very large molecules that differ by less than 1
percent in size are routine. In addition, separation in two
dimensions (by size and by charge) allows the total protein
content of cells to be resolved into more than 5000 individual
components. The use of electrophoresis to separate nucleic
acids on the basis of size has become one of the most common
laboratory procedures. In mixtures of chains of 500
nucleotides or fewer, chains of every length can be separated.
These nucleic acid fragments can now be sequenced with such
facility that DNA stretches thousands of nucleotides long are
typically sequenced within days. Protein sequencing of shorter
peptides has been entirely automated, as has the chemical
synthesis of oligonucleotides and peptides of 50 units or more
in length.

Two aspects of nucleic acid biochemistry, molecular
hybridization and nucleic acid enzymology, used in conjunction
with microbial genetics have spawned an array of revolutionary
techniques for identifying, cloning, and producing natural and
mutant nucleic acid sequences. Molecular hybridization (both
RNA-DNA and DNA-DNA), the fundamental method of testing the
identity of a nucleic acid sample, underlies many of these
applications. The detection of a single gene representing
perhaps as little as one part in 10⁶ of the total human genome
is routinely carried out by a hybridization procedure known as
the Southern blot. Especially sensitive are the Northern blot,
which detects specific mRNA, and the Western blot, which
employs antibodies to detect individual proteins.

Among the most important discoveries that allowed 
gene cloning was the recognition of the restriction endo- 
nucleases that cut DNA at characteristic restriction sites 
of 4-8 bp, thereby generating reproducible fragments 
from any genome. Enzymes that synthesize DNA and 
RNA are widely available in highly purified forms, as are 
enzymes that add to or remove nucleotides from the ends 
of nucleic acids and enzymes that join DNA segments. 
The clever use of these enzymes, coupled with a deep
understanding of microbial genetics that provides exquisitely
designed selectable vectors to receive tailor-made pieces of
DNA, has made recombinant DNA experiments commonplace.
Synthetic oligonucleotides allow planned deletion and mutation
of genes by substitution of sequences in recombinant DNA.
Today, any gene can be purified and the functional regions of
its DNA sequences can be explored by reintroducing the DNA
into cells and into whole organisms. As subsequent chapters
will show, these fantastic techniques have completely reshaped
the way biology is carried out today.
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